TRANSMITTAL LETTER TO THE UNITED STATES
DESIGNATED/ELECTED OFFICE (DO/EO/US)
CONCERNING A FILING UNDER 35 U.S.C. 371

ATTORNEY'S DOCKET NUMBER: 078883-0137
U.S. APPLICATION NO. (if known): To Be Assigned
INTERNATIONAL APPLICATION NO.: PCT/GB00/01002
INTERNATIONAL FILING DATE: 17 March 2000
PRIORITY DATE CLAIMED: 17 March 1999
TITLE OF INVENTION: ANTI-VIRAL VECTORS
APPLICANT(S) FOR DO/EO/US: Mark UDEN and Kyriacos MITROPHANOUS

Applicant herewith submits to the United States Designated/Elected Office (DO/EO/US) the following items and other information:

1. ☒ This is a FIRST submission of items concerning a filing under 35 U.S.C. 371.

2. □ This is a SECOND or SUBSEQUENT submission of items concerning a filing under 35 U.S.C. 371.






This express request to begin national examination procedures (35 U.S.C. 371(f)) at any time rather than delay
examination until the expiration of the applicable time limit set in 35 U.S.C. 371(b) and PCT Articles 22 and 39(1).

A proper Demand for International Preliminary Examination was made by the 19th month from the earliest claimed
priority date.

A copy of the International Application as filed (35 U.S.C. 371(c)(2))

☒ is transmitted herewith (required only if not transmitted by the International Bureau).

□ has been transmitted by the International Bureau.

□ is not required, as the application was filed in the United States Receiving Office (RO/US).

A translation of the International Application into English (35 U.S.C. 371(c)(2)).

Amendments to the claims of the International Application under PCT Article 19 (35 U.S.C. 371(c)(3))

□ are transmitted herewith (required only if not transmitted by the International Bureau).

□ have been transmitted by the International Bureau.

□ have not been made; however, the time limit for making such amendments has NOT expired.

☒ have not been made and will not be made.

A translation of the amendments to the claims under PCT Article 19 (35 U.S.C. 371(c)(3)).

An oath or declaration of the inventor(s) (35 U.S.C. 371(c)(4)).

A translation of the annexes to the International Preliminary Examination Report under PCT Article 36 (35 U.S.C.
371(c)(5)).



11. □ Applicant claims small entity status under 37 CFR 1.27.



Items 12. to 17. below concern other document(s) or information included:

An Information Disclosure Statement under 37 CFR 1.97 and 1.98.

An assignment document for recording. A separate cover sheet in compliance with 37 CFR 3.28 and 3.31 is included.

A FIRST preliminary amendment.
A SECOND or SUBSEQUENT preliminary amendment.

A substitute specification.

A change of power of attorney and/or address letter.

Other items or information: Copy of Sequence Listing with the Application (10 pages)



18. ☒ The following fees are submitted:                                    CALCULATIONS

Basic National Fee (37 CFR 1.492(a)(1)-(5)):

  Search Report has been prepared by the EPO or JPO                          $860.00

  International preliminary examination fee paid to USPTO
  (37 CFR 1.482)                                                             $690.00

  No international preliminary examination fee paid to USPTO (37 CFR 1.482)
  but international search fee paid to USPTO (37 CFR 1.445(a)(2))            $710.00

  Neither international preliminary examination fee (37 CFR 1.482) nor
  international search fee (37 CFR 1.445(a)(2)) paid to USPTO              $1,000.00

  International preliminary examination fee paid to USPTO (37 CFR 1.482)
  and all claims satisfied provisions of PCT Article 33(2)-(4)               $100.00

ENTER APPROPRIATE BASIC FEE AMOUNT =                                         $860.00

Surcharge of $130.00 for furnishing the oath or declaration later than 20
months from the earliest claimed priority date (37 CFR 1.492(e))               $0.00

Claims                Number Filed   Included in Basic Fee   Extra Claims   Rate
Total Claims          22             20                      2              $18.00    $36.00
Independent Claims                                                          $80.00     $0.00
Multiple dependent claim(s) (if applicable)                                 $270.00

TOTAL OF ABOVE CALCULATIONS =                                                $896.00

Reduction by 1/2 for filing by small entity, if applicable.                    $0.00

SUBTOTAL =                                                                   $896.00

Processing fee of $130.00 for furnishing English translation later than 20
months from the earliest claimed priority date (37 CFR 1.492(f)).

TOTAL NATIONAL FEE                                                           $896.00

Fee for recording the enclosed assignment (37 CFR 1.21(h)). The assignment must be
accompanied by an appropriate cover sheet (37 CFR 3.28, 3.31). $40.00 per property +

TOTAL FEES ENCLOSED =                                                        $896.00

Amount to be:
  refunded   $
  charged    $



a. ☒ A check in the amount of $896.00 to cover the above fees is enclosed.

b. □ Please charge my Deposit Account No. 19-0741 in the amount of $0.00 to the above fees. A duplicate copy of this sheet is
enclosed.

c. ☒ The Commissioner is hereby authorized to charge any additional fees which may be required, or credit any
overpayment to Deposit Account No. 19-0741. A duplicate copy of this sheet is enclosed.
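
The arithmetic behind the $896.00 total can be checked with a short sketch. It assumes only the figures shown in the fee calculation above (an $860.00 basic fee, 22 claims filed, 20 claims included in the basic fee, $18.00 per extra claim); the function name and parameters are illustrative and are not part of the form.

```python
from decimal import Decimal


def national_fee(basic_fee, total_claims, included_claims=20,
                 rate_per_extra_claim=Decimal("18.00"),
                 other_charges=Decimal("0.00")):
    """Add the extra-claims charge and any other charges to the basic fee.

    The defaults mirror the amounts on the transmittal sheet; they are
    assumptions for this sketch, not a statement of any current fee schedule.
    """
    extra_claims = max(total_claims - included_claims, 0)
    return basic_fee + extra_claims * rate_per_extra_claim + other_charges


# 22 claims filed: 2 extra claims at $18.00 on top of the $860.00 basic fee.
print(national_fee(Decimal("860.00"), 22))
```

Using `Decimal` rather than floating point keeps the currency amounts exact.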

NOTE: Where an appropriate time limit under 37 CFR 1.494 or 1.495 has not been met, a petition to revive (37 CFR
1.137(a) or (b)) must be filed and granted to restore the application to pending status.



SEND all correspondence TO: 



Foley & Lardner 
Washington Harbour 
3000 K Street, N.W., Suite 500 
Washington, D.C. 20007-5109 



SIGNATURE



NAME BERNHARD D. SAXE
REGISTRATION NUMBER 28,665 

Septembers, 2001 



IN THE UNITED STATES PATENT AND TRADEMARK OFFICE

Atty. Docket No: 078883/0137

In re patent application of
UDEN, MARK et al.
Serial No. 09/936,572
Filed: September 14, 2001
For: ANTI-VIRAL VECTORS

STATEMENT TO SUPPORT FILING AND SUBMISSION IN
ACCORDANCE WITH 37 C.F.R. §§ 1.821-1.825



Assistant Commissioner for Patents
Washington, D.C. 20231
Box SEQUENCE 

Sir: 

In connection with a Sequence Listing submitted concurrently 
herewith, the undersigned hereby states that: 

1. the submission, filed herewith in accordance with 37 
C.F.R. § 1.821(g), does not include new matter; 

2. the content of the attached paper copy and the
attached computer readable copy of the Sequence Listing, submitted in
accordance with 37 C.F.R. § 1.821(c) and (e), respectively, are the same;
and

3. all statements made herein of their own knowledge are
true and that all statements made on information and belief are believed to
be true; and further, that these statements were made with the knowledge
that willful false statements and the like so made are punishable by fine
or imprisonment, or both, under Section 1001 of Title 18 of the United
States Code and that such willful false statements may jeopardize the
validity of the application or any patent resulting therefrom.



HARBOR CONSULTING 

Intellectual Property Services 
1500A Lafayette Road
Suite 262 
Portsmouth, N.H. 
800-318-3021 



Respectfully submitted, 
IN THE UNITED STATES PATENT AND TRADEMARK OFFICE

Applicant:    Mark UDEN et al.
Title:        ANTI-VIRAL VECTORS
Appl. No.:    09/936,572
Filing Date:  September 17, 2001
Examiner:     Unassigned
Art Unit:     Unassigned

PRELIMINARY AMENDMENT



Commissioner for Patents 
Washington, D.C. 20231 

Sir: 

Prior to examination, please amend this application as follows.

IN THE FIGURES:

Please replace Figs. 3-6 and 9-13 with the enclosed marked-up version of the
Figures.

IN THE SPECIFICATION:

In accordance with 37 C.F.R. § 1.121, please replace the following paragraphs
with the identified rewritten paragraphs of the application. The changes are shown
explicitly in the attached "Version with Markings to Show Changes Made."

Page 22, please replace the fourth paragraph with the following:
--The ribozymes are hammerhead (Riddell et al., 1996) structures of the following
general structure:

     Helix I        Helix II          Helix III

5'-NNNNNNNN- CUGAUGAGGCCGAAAGGCCGAA -NNNNNNNN-3'

(SEQ ID NO: 15)--



Atty. Dkt. No. 078883/0137
Appln. No. 09/936,572

Please replace the paragraph bridging pages 22-23 with the following:
--The cleavage sites, targeting gag and pol, with the essential GUX triplet (where X is
any nucleotide base) are as follows:

GAG 1   5'   UAGUAAGAAUGUAUAGCCCUAC (SEQ ID NO: 16)
GAG 2   5'   AACCCAGAUUGUAAGACUAUUU (SEQ ID NO: 17)
GAG 3   5'   UGUUUCAAUUGUGGCAAAGAAG (SEQ ID NO: 18)
GAG 4   5'   AAAAAGGGCUGUUGGAAAUGUG (SEQ ID NO: 19)
POL 1   5'   ACGACCCCUCGUCACAAUAAAG (SEQ ID NO: 20)
POL 2   5'   GGAAUUGGAGGUUUUAUCAAAG (SEQ ID NO: 21)
POL 3   5'   AUAUUUUUCAGUUCCCUUAGAU (SEQ ID NO: 22)
POL 4   5'   UGGAUGAUUUGUAUGUAGGAUC (SEQ ID NO: 23)
POL 5   5'   CUUUGGAUGGGUUAUGAACUCC (SEQ ID NO: 24)
POL 6   5'   CAGCUGGACUGUCAAUGACAUA (SEQ ID NO: 25)
POL 7   5'   AACUUUCUAUGUAGAUGGGGCA (SEQ ID NO: 26)
POL 8   5'   AAGGCCGCCUGUUGGUGGGCAG (SEQ ID NO: 27)
POL 9   5'   UAAGACAGCAGUACAAAUGGCA (SEQ ID NO: 28)--
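
As a quick consistency check on the list above, the following sketch confirms that every listed site carries a GUX triplet. It relies on an observation about how these 22-mers happen to be written here (the GU dinucleotide falls at bases 11-12 of each entry); that offset and the helper name `gux_triplet` are assumptions of this sketch, not definitions taken from the specification.

```python
# Cleavage-site sequences copied from the list above (SEQ ID NO: 16-28).
CLEAVAGE_SITES = {
    "GAG 1": "UAGUAAGAAUGUAUAGCCCUAC",
    "GAG 2": "AACCCAGAUUGUAAGACUAUUU",
    "GAG 3": "UGUUUCAAUUGUGGCAAAGAAG",
    "GAG 4": "AAAAAGGGCUGUUGGAAAUGUG",
    "POL 1": "ACGACCCCUCGUCACAAUAAAG",
    "POL 2": "GGAAUUGGAGGUUUUAUCAAAG",
    "POL 3": "AUAUUUUUCAGUUCCCUUAGAU",
    "POL 4": "UGGAUGAUUUGUAUGUAGGAUC",
    "POL 5": "CUUUGGAUGGGUUAUGAACUCC",
    "POL 6": "CAGCUGGACUGUCAAUGACAUA",
    "POL 7": "AACUUUCUAUGUAGAUGGGGCA",
    "POL 8": "AAGGCCGCCUGUUGGUGGGCAG",
    "POL 9": "UAAGACAGCAGUACAAAUGGCA",
}


def gux_triplet(site, offset=10):
    """Return the three bases starting at `offset` if they begin with GU, else None.

    offset=10 (0-based) is an observed property of the entries above,
    not a value stated in the specification.
    """
    triplet = site[offset:offset + 3]
    if len(triplet) == 3 and triplet.startswith("GU"):
        return triplet
    return None


for name, site in CLEAVAGE_SITES.items():
    # Show the flanking arms with the GUX triplet bracketed in the middle.
    print(f"{name}: {site[:10]} [{gux_triplet(site)}] {site[13:]}")
```

A `None` in the bracketed position would flag an entry whose central triplet does not match the GUX pattern, which is a cheap way to catch transcription errors in the list.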



Page 23, please replace the second full paragraph with the following:
--The HCMV/HIV-1 hybrid 3' LTR is created by recombinant PCR with three PCR
primers (Figure 2). The first round of PCR is performed with RIB1 and RIB2 using pH4
(Kim et al., 1998) as the template to amplify the HIV-1 HXB2 sequence 8900-9123.
The second round of PCR makes the junction between the 5' end of the HIV-1 U3 and
the HCMV promoter by amplifying the hybrid 5' LTR from pH4. The PCR product from
the first PCR reaction and RIB3 serves as the 5' primer and 3' primer respectively.

RIB1: 5' CAGCTGCTCGAGCAGCTGAAGCTTGCATGC 3' (SEQ ID NO: 29)
RIB2: 5' GTAAGTTATGTAACGGACGATATCTTGTCTTCTT 3' (SEQ ID NO: 30)
RIB3: 5' CGCATAGTCGACGGGCCCGCCACTGCTAGAGATTTTC 3' (SEQ ID NO: 31)--

Please replace the paragraph bridging pages 27 and 28 with the following:
--Egs 1/1A (SEQ ID NO. 5)

(SEQ ID NO: 5) 5'-tcgagcccggggatgacgtcatcgacttcgaaggttcgaatccttctactgccaccatttttt 
cgggcccctactgcagtagctgaagcttccaagcttaggaagatgacggtggtaaaaaa 
ctctacgtcatcgacttcgaaggttcgaatccttccctgtccaccagtcgacc-3' 

gagatgcagtagctgaagcttccaagcttaggaagggacaggtggtcagctggagct-5' (SEQ ID NO: 32) 
Egs 2/2A (SEQ ID NO. 6) 

(SEQ ID NO. 6) 5'-tcgagtattacgtcatcgacttcgaaggttcgaatccttctagattcaccattttttaggaacg 
cataatgcagtagctgaagcttccaagcttaggaagtactaagtggtaaaaaatccttgc 
tcatcgacttcgaaggttcgaatccttccagttccaccagtcgacc-3' 
agtagctgaagcttccaagcttaggaaggtcaaggtggtcagctggagct-5' (SEQ ID NO. 33)

Egs 3/3A (SEQ ID NO. 7)

(SEQ ID NO. 7) 5'-tcgaggccaacgtcatcgacttcgaaggttcgaatccttctcttcccaccattttttttcc
ccggttgcagtagctgaagcttccaagcttaggaagagaagggtggtaaaaaaaagg
ctgaacgtcatcgacttcgaaggttcgaatccttctgctgtcaccagtcgacc-3'

gacttgcagtagctgaagcttccaagcttaggaagacgacagtggtcagctggagct-5' (SEQ ID NO. 34)

Egs 4/4 (SEQ ID NO. 8) 

(SEQ ID NO. 8) 5'-tcgagggctacgtcatcgacttcgaaggttcgaatccttcttgcttcaccatttttt 

cccgatgcagtagctgaatgcttccaagcttaggaagaacgaagtggtaaaaaa 
ctgaacgtcatcgacttcgaaggttcgaatccttctgctgtcaccagtcgacc-3' 

gacttgcagtagctgaagcttccaagcttaggaagacgacagtggtcagctggagct-5' (SEQ ID NO. 35) 
Egs 5/5A (SEQ ID NO. 9)

(SEQ ID NO. 8) 5'-tcgagtataacgtcatcgacttcgaaggttcgaatccttcaccggtcaccatttttttata
catattgcagtagctgaagcttccaagcttaggaagtggccagtggtaaaaaaatat
acgtcatcgacttcgaaggttcgaatccttcttcttacaccagtcgacc-3'

tgcagtagctgaagcttccaagcttaggaagaagaatgtggtcagctggagct-5' (SEQ ID NO. 36)
Egs 6/6A (SEQ ID NO. 10)

(SEQ ID NO. 10) 5'-tcgaggtacacgtcatcgacttcgaaggttcgaatccttcgtagttcaccattttttgtgc 
ccatgtgcagtagctgaagcttccaagcttaggaagcatcaagtggtaaaaaacacg 
acgtcatcgacttcgaaggttcgaatccttctaggcccaccagtcgacgcatgcc-3' 

tgcagtagctgaagcttccaagcttaggaagatccgggtggtcagctgcgtacggagct-5' (SEQ ID NO. 37)--

REMARKS 

Formal examination of this application is respectfully requested. 

Figures 3-6 and 9-13 and the specification were amended to recite sequence ID 
numbers for the listed sequences. 

As the foregoing amendments do not introduce new matter, entry thereof by the 
Examiner is respectfully requested. 

The Commissioner is hereby authorized to charge any additional fees which may 
be required regarding this application under 37 C.F.R. §§ 1.16-1.17, or credit any 
overpayment, to Deposit Account No. 19-0741.

Should no proper payment be enclosed herewith, as by a check being in the 
wrong amount, unsigned, post-dated, otherwise improper or informal or even entirely 
missing, the Commissioner is authorized to charge the unpaid amount to Deposit 
Account No. 19-0741.



Date: December 11, 2001

FOLEY & LARDNER 
Washington Harbour 
3000 K Street, N.W., Suite 500 
Washington, D.C. 20007-5109 
Telephone: (202) 672-5538 
Facsimile: (202) 672-5399 



Respectfully submitted, 

Michele M. Simkin
Attorney for Applicant 
Registration No. 34,717 




"Version of the Specification with Markings to Show Changes Made"

Page 22, please replace the fourth paragraph with the following:
--The ribozymes are hammerhead (Riddell et al., 1996) structures of the following
general structure:

     Helix I        Helix II          Helix III

5'-NNNNNNNN- CUGAUGAGGCCGAAAGGCCGAA -NNNNNNNN-3'

(SEQ ID NO: 15)--

Please replace the paragraph bridging pages 22-23 with the following:
--The cleavage sites, targeting gag and pol, with the essential GUX triplet (where X is
any nucleotide base) are as follows:

GAG 1   5'   UAGUAAGAAUGUAUAGCCCUAC (SEQ ID NO: 16)
GAG 2   5'   AACCCAGAUUGUAAGACUAUUU (SEQ ID NO: 17)
GAG 3   5'   UGUUUCAAUUGUGGCAAAGAAG (SEQ ID NO: 18)
GAG 4   5'   AAAAAGGGCUGUUGGAAAUGUG (SEQ ID NO: 19)
POL 1   5'   ACGACCCCUCGUCACAAUAAAG (SEQ ID NO: 20)
POL 2   5'   GGAAUUGGAGGUUUUAUCAAAG (SEQ ID NO: 21)
POL 3   5'   AUAUUUUUCAGUUCCCUUAGAU (SEQ ID NO: 22)
POL 4   5'   UGGAUGAUUUGUAUGUAGGAUC (SEQ ID NO: 23)
POL 5   5'   CUUUGGAUGGGUUAUGAACUCC (SEQ ID NO: 24)
POL 6   5'   CAGCUGGACUGUCAAUGACAUA (SEQ ID NO: 25)
POL 7   5'   AACUUUCUAUGUAGAUGGGGCA (SEQ ID NO: 26)
POL 8   5'   AAGGCCGCCUGUUGGUGGGCAG (SEQ ID NO: 27)
POL 9   5'   UAAGACAGCAGUACAAAUGGCA (SEQ ID NO: 28)--



Page 23, please replace the second full paragraph with the following:
--The HCMV/HIV-1 hybrid 3' LTR is created by recombinant PCR with three PCR
primers (Figure 2). The first round of PCR is performed with RIB1 and RIB2 using pH4
(Kim et al., 1998) as the template to amplify the HIV-1 HXB2 sequence 8900-9123.
The second round of PCR makes the junction between the 5' end of the HIV-1 U3 and
the HCMV promoter by amplifying the hybrid 5' LTR from pH4. The PCR product from
the first PCR reaction and RIB3 serves as the 5' primer and 3' primer respectively.

RIB1: 5' CAGCTGCTCGAGCAGCTGAAGCTTGCATGC 3' (SEQ ID NO: 29)
RIB2: 5' GTAAGTTATGTAACGGACGATATCTTGTCTTCTT 3' (SEQ ID NO: 30)
RIB3: 5' CGCATAGTCGACGGGCCCGCCACTGCTAGAGATTTTC 3' (SEQ ID NO: 31)--

Please replace the paragraph bridging pages 27 and 28 with the following:
--Egs 1/1A (SEQ ID NO. 5)

(SEQ ID NO: 5) 5'-tcgagcccggggatgacgtcatcgacttcgaaggttcgaatccttctactgccaccatttttt 
cgggcccctactgcagtagctgaagcttccaagcttaggaagatgacggtggtaaaaaa 
ctctacgtcatcgacttcgaaggttcgaatccttccctgtccaccagtcgacc-3' 

gagatgcagtagctgaagcttccaagcttaggaagggacaggtggtcagctggagct-5' (SEQ ID NO: 32) 
Egs 2/2A (SEQ ID NO. 6) 

(SEQ ID NO. 6) 5'-tcgagtattacgtcatcgacttcgaaggttcgaatccttctagattcaccattttttaggaacg 
cataatgcagtagctgaagcttccaagcttaggaagtactaagtggtaaaaaatccttgc 
tcatcgacttcgaaggttcgaatccttccagttccaccagtcgacc-3' 

agtagctgaagcttccaagcttaggaaggtcaaggtggtcagctggagct-5' (SEQ ID NO. 33) 
Egs 3/3A (SEQ ID NO. 7) 

(SEQ ID NO. 7) 5'-tcgaggccaacgtcatcgacttcgaaggttcgaatccttctcttcccaccattttttttcc

ccggttgcagtagctgaagcttccaagcttaggaagagaagggtggtaaaaaaaagg 
ctgaacgtcatcgacttcgaaggttcgaatccttctgctgtcaccagtcgacc-3' 

gacttgcagtagctgaagcttccaagcttaggaagacgacagtggtcagctggagct-5' (SEQ ID NO. 34) 
Egs 4/4 (SEQ ID NO. 8) 

(SEQ ID NO. 8) 5'-tcgagggctacgtcatcgacttcgaaggttcgaatccttcttgcttcaccatttttt 

cccgatgcagtagctgaatgcttccaagcttaggaagaacgaagtggtaaaaaa 
ctgaacgtcatcgacttcgaaggttcgaatccttctgctgtcaccagtcgacc-3' 

gacttgcagtagctgaagcttccaagcttaggaagacgacagtggtcagctggagct-5' (SEQ ID NO. 35) 
Egs 5/5A (SEQ ID NO. 9)

(SEQ ID NO. 8) 5'-tcgagtataacgtcatcgacttcgaaggttcgaatccttcaccggtcaccatttttttata 
catattgcagtagctgaagcttccaagcttaggaagtggccagtggtaaaaaaatat 
acgtcatcgacttcgaaggttcgaatccttcttcttacaccagtcgacc-3' 

tgcagtagctgaagcttccaagcttaggaagaagaatgtggtcagctggagct-5' (SEQ ID NO. 36) 
Egs 6/6A (SEQ ID NO. 10) 

(SEQ ID NO. 10) 5'-tcgaggtacacgtcatcgacttcgaaggttcgaatccttcgtagttcaccattttttgtgc 
ccatgtgcagtagctgaagcttccaagcttaggaagcatcaagtggtaaaaaacacg 
acgtcatcgacttcgaaggttcgaatccttctaggcccaccagtcgacgcatgcc-3' 

tgcagtagctgaagcttccaagcttaggaagatccgggtggtcagctgcgtacggagct-5' (SEQ ID NO. 37)--



WO 00/55341                                                  PCT/GB00/01002

Figure 3

gagpol-HXB2 Codon Usage
DNA sequence 4303 b.p. ATGGGTGCGAGA ... GATGAGGATTAG
143 S codons
MW : 161929 Dalton   CAI(S.c.) : 0.033   CAI(E.c.) : 0.151

TTT phe F 21    TCT ser S 3     TAT tyr Y 30    TGT cys C 13
TTC phe F 14    TCC ser S 3     TAC tyr Y 9     TGC cys C 2
TTA leu L 4S    TCA ser S 19    TAA OCH Z -     TGA OPA Z
TTG leu L 11    TCG ser S 1     TAG AMB Z 1     TGG trp W 37
CTT leu L 13    CCT pro P 21    CAT his H 20    CGT arg R
CTC leu L 7     CCC pro P 14    CAC his H 7     CGC arg R -
CTA leu L 17    CCA pro P 41    CAA gln Q 55    CGA arg R 3
CTG leu L IS    CCG pro P -     CAG gln Q 39    CGG arg R 3
ATT ile I 30    ACT thr T 24    AAT asn N 42    AGT ser S IS
ATC ile I 14    ACC thr T 20    AAC asn N Ifi   AGC ser S IS
ATA ile I 5S    ACA thr T 43    AAA lys K 33    AGA arg R 45
ATG met M 29    ACG thr T 1     AAG lys K 34    AGG arg R 13
GTT val V 15    GCT ala A 17    GAT asp D 37    GGT gly G 11
GTC val V 11    GCC ala A 19    GAC asp D 26    GGC gly G 10
GTA val V 55    GCA ala A 55    GAA glu E 75    GGA gly G SI
GTG val V 15    GCG ala A 5     GAG glu E 32    GGG gly G 25
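
A codon usage table of this kind is a tally of the in-frame triplets of a coding sequence. The sketch below shows the idea on the first twelve bases of the wild-type gagpol sequence (`ATGGGTGCGAGA`, from the figure header) and of the codon-optimised gagpol-SYNgp sequence (`ATGGGCGCCCGC`, from the sequence listing). The function names are illustrative, and the translation map covers only the seven codons that occur in these two fragments; it is not a full genetic code table and not the program that produced the figure.

```python
from collections import Counter

# Standard genetic code entries for just the codons used in the two fragments.
CODON_TO_AA = {
    "ATG": "M",
    "GGT": "G", "GGC": "G",
    "GCG": "A", "GCC": "A",
    "AGA": "R", "CGC": "R",
}


def codons(dna):
    """Split a DNA string into in-frame triplets, ignoring any trailing bases."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return [dna[i:i + 3] for i in range(0, usable, 3)]


def codon_usage(dna):
    """Count how often each codon occurs, as in a codon usage table."""
    return Counter(codons(dna))


def translate(dna):
    """Translate using the partial map above (raises KeyError on other codons)."""
    return "".join(CODON_TO_AA[c] for c in codons(dna))


wild_type = "ATGGGTGCGAGA"   # start of the HXB2 gagpol sequence
optimised = "ATGGGCGCCCGC"   # start of the gagpol-SYNgp sequence

print(codon_usage(wild_type))
print(codon_usage(optimised))
print(translate(wild_type), translate(optimised))
```

The two fragments use different codons at three of the four positions yet translate to the same four residues, which is the property a codon-optimised sequence is expected to preserve.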

Figure 4

gagpol-SYNgp [1 to 4303] -> Codon Usage
DNA sequence 4303 b.p. ATGGGCGCCCGC ... GATGAGGATTAG linear
143 S codons
MW : 161929 Dalton   CAI(S.c.) : 0.080   CAI(E.c.) : 0.29o

TTT phe F 5     TCT ser S 5     TAT tyr Y 10    TGT cys C G
TTC phe F 30    TCC ser S 11    TAC tyr Y 29    TGC cys C 14
TTA leu L 2     TCA ser S 4     TAA OCH Z       TGA OPA Z
TTG leu L 7     TCG ser S S     TAG AMB Z 1     TGG trp W 37
CTT leu L 3     CCT pro P 14    CAT his H       CGT arg R 2
CTC leu L 22    CCC pro P 39    CAC his H 21    CGC arg R 34
CTA leu L S     CCA pro P 10    CAA gln Q 14    CGA arg R 3
CTG leu L 70    CCG pro P 13    CAG gln Q SI    CGG arg R 10
ATT ile I 17    ACT thr T 11    AAT asn N 13    AGT ser S 7
ATC ile I 79    ACC thr T 43    AAC asn N 45    AGC ser S 27
ATA ile I 4     ACA thr T 13    AAA lys K 25    AGA arg R 7
ATG met M 29    ACG thr T IS    AAG lys K 97    AGG arg R 13
GTT val V 5     GCT ala A 15    GAT asp D 19    GGT gly G 10
GTC val V 27    GCC ala A 55    GAC asp D 44    GGC gly G 54
GTA val V 5     GCA ala A 13    GAA glu E 29    GGA gly G IS
GTG val V 58    GCG ala A 12    GAG glu E 73    GGG gly G 23

Figure 5

env-mn [1 to 2571] -> Codon Usage
DNA sequence 2571 b.p. ATGAGAGTGAAG ... GCTTTGCTATAA linear
857 codons
MW : 97078 Dalton   CAI(S.c.) : 0.Q33   CAI(E.c.) : 0.140

ATT ile I 21    ACT             AGT
ATC ile         thr
GTT val V 3     GCT ala A 16    GAT asp D 13    GGT gly G 10

Figure 6

SYNgp160mn -> Codon Usage
DNA sequence 2571 b.p. ATGAGGGTGAAG ... GCGCTGCTGTAA
857 codons
MW : 97073 Dalton   CAI(S.c.) : 0.074   CAI(E.c.) : 0.419

TTT phe F       TCT ser S 2     TAT tyr Y 1     TGT cys C
TTC phe F 24    TCC ser S 4     TAC tyr Y 21    TGC cys C 21
TTA leu L -     TCA ser S       TAA OCH Z 1     TGA OPA Z
TTG leu L       TCG ser S       TAG AMB Z       TGG trp W 3-Q
CTT leu L -     CCT pro P       CAT his H 2     CGT arg R 1
CTC leu L 20    CCC pro P 26    CAC his H 12    CGC arg R 36
CTA leu L 1     CCA pro P       CAA gln Q       CGA arg R
CTG leu L S3    CCG pro P 2     CAG gln Q 41    CGG arg R 4
ATT ile I 2     ACT thr T       AAT asn N 2     AGT ser S
ATC ile I SI    ACC thr T 59    AAC asn N SI    AGC ser S 43
ATA ile I -     ACA thr T       AAA lys K 1     AGA arg R 2
ATG met M 17    ACG thr T 4     AAG lys K 45    AGG arg R 6
GTT val V       GCT ala A -     GAT asp D 2     GGT gly G 1
GTC val V 1     GCC ala A 40    GAC asp D 30    GGC gly G 47
GTA val V 1     GCA ala A       GAA glu E 3     GGA gly G
GTG val V 53    GCG ala A 3     GAG glu E 43    GGG gly G 3

Figure 9 A

Figure 9 B

Generic design of EGSs to target any RNA.

Atty. Dkt. No. 078883/0137

IN THE UNITED STATES PATENT AND TRADEMARK OFFICE

Applicant:    Mark UDEN et al.
Title:        ANTI-VIRAL VECTORS
Appl. No.:    Unassigned
Filing Date:  September 17, 2001
Examiner:     Unassigned
Art Unit:     Unassigned

PRELIMINARY AMENDMENT

Commissioner for Patents 
Washington, D.C. 20231 

Sir: 

Prior to examination, Applicants respectfully request that the above-identified
application be amended as follows:

IN THE CLAIMS:

Please cancel claims 22-23 without prejudice or disclaimer.

In accordance with 37 C.F.R. § 1.121, please substitute for claims 5, 8, 11-14, 16-19
and 21 the following rewritten versions of the same claims, as amended. The changes are
shown explicitly in the attached "Versions with Markings to Show Changes Made."

What Is Claimed Is:

5. A system according to claim 1 wherein the viral vector is a retroviral vector.

8. A system according to claim 5 wherein the polypeptide required for the
assembly of viral particles is selected from gag, pol and env proteins.

11. A system according to claim 9 wherein the lentivirus is HIV.

12. A system according to claim 1 wherein the third nucleotide sequence is 
resistant to cleavage directed by the gene product as a result of one or more conservative 
alterations in the nucleotide sequence which remove cleavage sites recognised by the at least 
one gene product and/or binding sites for the at least one gene product. 

13. A system according to claim 1 wherein the third nucleotide sequence is 
adapted to be resistant to cleavage by the at least one gene product. 

14. A system according to claim 1 wherein the third nucleotide sequence is codon 
optimised for expression in producer cells. 

16. A system according to claim 1 comprising a plurality of first nucleotide 
sequences and third nucleotide sequences as defined therein. 

17. A viral particle comprising a viral vector genome as defined in claim 3 and 
one or more third nucleotide sequences as defined in claim 3.

18. A viral particle produced using a viral vector production system according to 
claim 3. 

19. A method for producing a viral particle which method comprises introducing 
into a host cell (i) a viral genome as defined in claim 3 (ii) one or more third nucleotide 
sequences as defined in claim 3 and (iii) nucleotide sequences encoding the other essential 
viral packaging components not encoded by the one or more third nucleotide sequences. 

21. A pharmaceutical composition comprising a viral particle according to claim 
17, together with a pharmaceutically acceptable carrier or diluent. 

Please add the following new claim: 

--24. (New) A method of treating a viral infection, comprising administering to a
subject infected with a virus an effective amount of a viral system according to claim 1.--

REMARKS

Applicants respectfully request that the foregoing amendments to the claims be 
entered in order to avoid this application incurring a surcharge for the presence of one or 
more multiple dependent claims. 



Respectfully submitted, 



Date: September 14, 2001

By



FOLEY & LARDNER
Washington Harbour
3000 K Street, N.W., Suite 500
Washington, D.C. 20007-5109
Telephone: (202) 672-5427
Facsimile: (202) 672-5399



Bernhard D. Saxe
Attorney for Applicants
Registration No. 28,665

MARKED UP VERSION TO SHOW CHANGES

What Is Claimed Is:

5. A system according to [any one of claims 1 to 4] claim 1 wherein the viral
vector is a retroviral vector.

8. A system according to [any one of claims 5 to 7] claim 5 wherein the
polypeptide required for the assembly of viral particles is selected from gag, pol and env
proteins.

11. A system according to claim 9 [or 10] wherein the lentivirus is HIV.

12. A system according to [any one of the preceding claims] claim 1 wherein the
third nucleotide sequence is resistant to cleavage directed by the gene product as a result of
one or more conservative alterations in the nucleotide sequence which remove cleavage sites
recognised by the at least one gene product and/or binding sites for the at least one gene
product.

13. A system according to [any one of claims 1 to 11] claim 1 wherein the third
nucleotide sequence is adapted to be resistant to cleavage by the at least one gene product.

14. A system according to [any one of the preceding claims] claim 1 wherein the
third nucleotide sequence is codon optimised for expression in producer cells.

16. A system according to [any one of the preceding claims] claim 1 comprising a
plurality of first nucleotide sequences and third nucleotide sequences as defined therein.

17. A viral particle comprising a viral vector genome as defined in [any one of
claims 3 to 16] claim 3 and one or more third nucleotide sequences as defined in [any of
claims 3 to 16] claim 3.

18. A viral particle produced using a viral vector production system according to
[any one of claims 3 to 16] claim 3.

19. A method for producing a viral particle which method comprises introducing
into a host cell (i) a viral genome as defined in [any one of claims 3 to 16] claim 3 (ii) one or
more third nucleotide sequences as defined in [any of claims 3 to 16] claim 3 and (iii)
nucleotide sequences encoding the other essential viral packaging components not encoded
by the one or more third nucleotide sequences.

21. A pharmaceutical composition comprising a viral particle according to [claims
17, 18 or 20] claim 17, together with a pharmaceutically acceptable carrier or diluent.

Please add the following new claim:

--24. (New) A method of treating a viral infection, comprising administering to a
subject infected with a virus an effective amount of a viral system according to claim 1.--
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SEOUENCE LISTING PART OF THE DESCRIPTION 



SEQ. ID. NO. 1 - Wild type gagpol sequence for strain HXB2 (accession no. K03455) 

ATGGGTGCGA GAGCGTCAGT ATTAAGCGGG GGAGAATTAG ATCGATGGGA AAAAATTCGG 6 0 
TTAAGGCCAG GGGGAAAGAA AAAATATAAA TTAAAACATA TAGTATGGGC AAGCAGGGAG 12 0 
CTAGAACGAT TCGCAGTTAA TCCTGGCCTG TTAGAAACAT CAGAAGGCTG TAGACAAATA 180 
CTGGGACAGC TACAACCATC CCTTCAGACA GGATCAGAAG AACTTAGATC ATTATATAAT 24 0 
ACAGTAGCAA CCCTCTATTG TGTGCATCAA AGGATAGAGA TAAAAGACAC CAAGGAAGCT 3 00 
TTAGACAAGA TAGAGGAAGA GCAAAACAAA AGTAAGAAAA AAGCACAGCA AGCAGCAGCT 3 60 
GACACAGGAC ACAGCAATCA GGTCAGCCAA AATTACCCTA TAGTGCAGAA CATCCAGGGG 420 
CAAATGGTAC ATCAGGCCAT ATCACCTAGA ACTTTAAATG CATGGGTAAA AGTAGTAGAA 48 0 
GAGAAGGCTT TCAGCCCAGA AGTGATACCC A't'GTTTTCAG CATTATCAGA AGGAGCCACC 54 0 
CCACAAGATT TAAACACCAT GCTAAACACA GTGGGGGGAC ATCAAGCAGC CATGCAAATG 600 
TTAAAAGAGA CCATCAATGA GGAAGCTGCA GAATGGGATA GAGTGCATCC AGTGCATGCA 66 0 
GGGCCTATTG CACCAGGCCA GATGAGAGAA CCAAGGGGAA GTGACATAGC AGGAACTACT 72 0 
AGTACCCTTC AGGAACAAAT AGGATGGATG ACAAATAATC CACCTATCCC AGTAGGAGAA 78 0 
ATTTATAAAA GATGGATAAT CCTGGGATTA AATAAAATAG TAAGAATGTA TAGCCCTACC 84 0 
AGCATTCTGG ACATAAGACA AGGACCAAAG GAACCCTTTA GAGACTATGT AGACCGGTTC 900 
TATAAAACTC TAAGAGCCGA GCAAGCTTCA CAGGAGGTAA AAAATTGGAT GACAGAAACC 960 
TTGTTGGTCC AAAATGCGAA CCCAGATTGT AAGACTATTT TAAAAGCATT GGGACCAGCG 102 0 
GCTACACTAG AAGAAATGAT GACAGCATGT CAGGGAGTAG GAGGACCCGG CCATAAGGCA 10 SO 
AGAGTTTTGG CTGAAGCAAT GAGCCAAGTA ACAAATTCAG CTACCATAAT GATGCAGAGA 114 0 
GGCAATTTTA GGAACCAAAG AAAGATTGTT AAGTGTTTCA ATTGTGGCAA AGAAGGGCAC 1200 
ACAGCCAGAA ATTGCAGGGC CCCTAGGAAA AAGGGCTGTT GGAAATGTGG AAAGGAAGGA 126 0 
CACCAAATGA AAGATTGTAC TGAGAGACAG GCTAATTTTT TAGGGAAGAT CTGGCCTTCC 1320 
TACAAGGGAA GGCCAGGGAA TTTTCTTCAG AGCAGACCAG AGCCAACAGC CCCACCAGAA 13 8 0 
GAGAGCTTCA GGTCTGGGGT AGAGACAACA ACTCCCCCTC AGAAGCAGGA GCCGATAGAC 144 0 
AAGGAACTGT ATCCTTTAAC TTCCCTCAGG TCACTCTTTG GCAACGACCC CTCGTCACAA 15 0 0 
TAAAGATAGG GGGGCAAGTA AAGGAAGCTC TATTAGATAC AGGAGCAGAT GATACAGTAT 1560 
TAGAAGAAAT GAGTTTGCCA GGAAGATGGA AACCAAAAAT GATAGGGGGA ATTGGAGGTT 1620 
TTATCAAAGT AAGACAGTAT GATCAGATAC TCATAGAAAT CTGTGGACAT AAAGCTATAG 16 8 0 
GTACAGTATT AGTAGGACCT ACACCTGTCA ACATAATTGG AAGAAATCTG TTGACTCAGA 174 0 
TTGGTTGCAC TTTAAATTTT CCCATTAGCC CTATTGAGAC TGTACCAGTA AAATTAAAGC 18 00 
CAGGAATGGA TGGCCCAAAA GTTAAACAAT GGCCATTGAC AGAAGAAAAA ATAAAAGCAT 18 6 0 
TAGTAGAAAT TTGTACAGAG ATGGAAAAGG AAGGGAAAAT TTCAAAAATT GGGCCTGAAA 1920 
ATCCATACAA TACTCCAGTA TTTGCCATAA AGAAAAAAGA CAGTACTAAA TGGAGAAAAT 1980 
TAGTAGATTT CAGAGAACTT AATAAGAGAA CTCAAGACTT CTGGGAAGTT CAATTAGGAA 2040 
TACCACATCC CGCAGGGTTA AAAAAGAAAA AATCAGTAAC AGTACTGGAT GTGGGTGATG 210 0 
CATATTTTTC AGTTCCCTTA GATGAAGACT TCAGGAAGTA TACTGCATTT ACCATACCTA 216 0 
GTATAAACAA TGAGACACCA GGQATTAGAT ATCAGTACAA TGTGCTTCCA CAGGGATGGA 2220 
AAGGATCACC AGCAATATTC CAAAGTAGCA TGACAAAAAT CTTAGAGCCT TTTAGAAAAC 22 8 0 
AAAATCCAGA CATAGTTATC TATCAATACA TGGATGATTT GTATGTAGGA TCTGACTTAG 234 0 
AAATAGGGCA GCATAGAACA AAAATAGAGG AGCTGAGACA ACATCTGTTG AGGTGGGGAC 24 00 
TTACCACACC AGACAAAAAA CATCAGAAAG AACCTCCATT CCTTTGGATG GGTTATGAAC 246 0 
TCCATCCTGA TAAATGGACA GTACAGCCTA TAGTGCTGCC AGAAAAAGAC AGCTGGACTG 252 0 
TCAATGACAT ACAGAAGTTA GTGGGGAAAT TGAATTGGGC AAGTCAGATT TACCCAGGGA 25 8 0 
TTAAAGTAAG GCAATTATGT AAACTCCTTA GAGGAACCAA AGCACTAACA GAAGTAATAC 264 0 
CACTAACAGA AGAAGCAGAG CTAGAACTGG CAGAAAACAG AGAGATTCTA AAAGAACCAG 2700 
TACATGGAGT GTATTATGAC CCATCAAAAG ACTTAATAGC AGAAATACAG AAGCAGGGGC 2 760 
AAGGCCAATG GACATATCAA ATTTATCAAG AGCCATTTAA AAATCTGAAA ACAGGAAAAT 2 820 
ATGCAAGAAT QAGGGGTGCC CACACTAATG ATGTAAAACA ATTAACAGAG GCAGTGCAAA 2 880 
AAATAACCAC AGAAAGCATA GTAATATGGG GAAAGACTCC TAAATTTAAA CTGCCCATAC 2 94 0 
AAAAGGAAAC ATGGGAAACA TGGTGGACAG AGTATTGGCA AGCCACCTGG ATTCCTGAGT 3 00 0 
GGGAGTTTGT TAATACCCCT CCCTTAGTGA AATTATGGTA CCAGTTAGAG AAAGAACCCA 3 060 
TAGTAGGAGC AGAAACCTTC TATGTAGATG GGGCAGCTAA CAGGGAGACT AAATTAGGAA 3120 
AAGCAGGATA TGTTACTAAT AGAGGAAGAC AAAAAGTTGT CACCCTAACT GACACAACAA 3180
ATCAGAAGAG TGAGTTACAA GCAATTTATC TAGCTTTGCA GGATTCGGGA TTAGAAGTAA 3240 
ACATAGTAAC AGACTCACAA TATGCATTAG GAATCATTCA AGCACAACCA GATCAAAGTG 3300
AATCAGAGTT AGTCAATCAA ATAATAGAGC AGTTAATAAA AAAGGAAAAG GTCTATCTGG 3 36 0 
CATGGGTACC AGCACACAAA GGAATTGGAG GAAATGAACA AGTAGATAAA TTAGTCAGTG 342 0 
CTGGAATCAG GAAAGTACTA TTTTTAGATG GAATAGATAA GGCCCAAGAT GAACATGAGA 34 8 0 
AATATCACAG TAATTGGAGA GCAATGGCTA GTGATTTTAA CCTGCCAGCT GTAGTAGCAA 3 540 
AAGAAATAGT AGCCAGCTGT GATAAATGTC AGCTAAAAGG AGAAGCCATG CATGGACAAG 3600
TAGACTGTAG TCCAGGAATA TGGCAACTAG ATTGTACACA TTTAGAAGGA AAAGTTATCC 3660
TGGTAGCAGT TCATGTAGCC AGTGGATATA TAGAAGCAGA AGTTATTCCA GCAGAAACAG 3720
GGCAGGAAAC AGCATATTTT CTTTTAAAAT TAGCAGGAAG ATGGCCAGTA AAAACAATAC 3780
ATACTGACAA TGGCAGCAAT TTCACCGGTG CTACGGTTAG GGCCGCCTGT TGGTGGGCGG 3840
GAATCAAGCA GGAATTTGGA ATTCCCTACA ATCCCCAAAG TCAAGGAGTA GTAGAATCTA 3900
TGAATAAAGA ATTAAAGAAA ATTATAGGAC AGGTAAGAGA TCAGGCTGAA CATCTTAAGA 3960
CAGCAGTACA AATGGCAGTA TTCATCCACA ATTTTAAAAG AAAAGGGGGG ATTGGGGGGT 4020
ACAGTGCAGG GGAAAGAATA GTAGACATAA TAGCAACAGA CATACAAACT AAAGAATTAC 4080
AAAAACAAAT TACAAAAATT CAAAATTTTC GGGTTTATTA CAGGGACAGC AGAAATTCAC 4140
TTTGGAAAGG ACCAGCAAAG CTCCTCTGGA AAGGTGAAGG GGCAGTAGTA ATACAAGATA 4200
ATAGTGACAT AAAAGTAGTG CCAAGAAGAA AAGCAAAGAT CATTAGGGAT TATGGAAAAC 4260
AGATGGCAGG TGATGATTGT GTGGCAAGTA GACAGGATGA GGATTAG 4307



SEQ. I.D. NO. 2 - gagpol-SYNgp - codon optimised gagpol sequence

ATGGGCGCCC GCGCCAGCGT GCTGTCGGGC GGCGAGCTGG ACCGCTGGGA GAAGATCCGC 60 
CTGCGCCCCG GCGGCAAAAA GAAGTACAAG CTGAAGCACA TCGTGTGGGC CAGCCGCGAA 120 
CTGGAGCGGT TCGCCGTGAA CCCCGGGCTC CTGGAGACCA GCGAGGGGTG CCGCCAGATC 180 
CTCGGCCAAC TGCAGCCCAG CCTGCAAACC GGCAGCGAGG AGCTGCGCAG CCTGTACAAC 24 0 
ACCGTGGCCA CGCTGTACTG CGTCCACCAG CGCATCGAAA TCAAGGATAC GAAAGAGGCC 3 00 
CTGGATAAAA TCGAAGAGGA ACAGAATAAG AGCAAAAAGA AGGCCCAACA GGCCGCCGCG 36 0 
GACACCGGAC ACAGCAACCA GGTCAGCCAG AACTACCCCA TCGTGCAGAA CATCCAGGGG 42 0 
CAGATGGTGC ACCAGGCCAT CTCCCCCCGC ACGCTGAACG CCTGGGTGAA GGTGGTGGAA 48 0 
GAGAAGGCTT TTAGCCCGGA GGTGATACCC ATGTTCTCAG CCCTGTCAGA GGGAGCCACC 54 0 
CCCCAAGATC TGAACACCAT GCTCAACACA GTGGGGGGAC ACCAGGCCGC CATGCAGATG 6 00 
CTGAAGGAGA CCATCAATGA GGAGGCTGCC GAATGGGATC GTGTGCATCC GGTGCACGCA 66 0 
GGGCCCATCG CACCGGGCCA GATGCGTGAG CCACGGGGCT CAGACATCGC CGGAACGACT 720
AGTACCCTTC AGGAACAGAT CGGCTGGATG ACCAACAACC CACCCATCCC GGTGGGAGAA 780
ATCTACAAAC GCTGGATCAT CCTGGGCCTG AACAAGATCG TGCGCATGTA TAGCCCTACC 840 
AGCATCCTGG ACATCCGCCA AGGCCCGAAG GAACCCTTTC GCGACTACGT GGACCGGTTC 900 
TACAAAACGC TCCGCGCCGA GCAGGCTAGC CAGGAGGTGA AGAACTGGAT GACCGAAACC 960 
CTGCTGGTCC AGAACGCGAA CCCGGACTGC AAGACGATCC TGAAGGCCCT GGGCCCAGCG 1020 
GCTACCCTAG AGGAAATGAT GACCGCCTGT CAGGGAGTGG GCGGACCCGG CCACAAGGCA 108 0 
CGCGTGCTGG CTGAGGCCAT GAGCCAGGTG ACCAACTCCG CTACCATCAT GATGCAGCGC 114 0 
GGCAACTTTC GGAACCAACG CAAGATCGTC AAGTGCTTCA ACTGTGGCAA AGAAGGGCAC 1200
ACAGCCCGCA ACTGCAGGGC CCCTAGGAAA AAGGGCTGCT GGAAATGCGG CAAGGAAGGC 126 0 
CACCAGATGA AAGACTGTAC TGAGAGACAG GCTAATTTTT TAGGGAAGAT CTGGCCTTCC 1320 
TACAAGGGAA GGCCAGGGAA TTTTCTTCAG AGCAGACCAG AGCCAACAGC CCCACCAGAA 13 80 
GAGAGCTTCA GGTCTGGGGT AGAGACAACA ACTCCCCCTC AGAAGCAGGA GCCGATAGAC 1440 
AAGGAACTGT ATCCTTTAAC TTCCCTCAGA TCACTCTTTG GCAACGACCC CTCGTCACAA 150 0 
TAAAGATAGG GGGGCAGCTC AAGGAGGCTC TCCTGGACAC CGGAGCAGAC GACACCGTGC 156 0 
TGGAGGAGAT GTCGTTGCCA GGCCGCTGGA AGCCGAAGAT GATCGGGGGA ATCGGCGGTT 162 0 
TCATCAAGGT GCGCCAGTAT GACCAGATCC TCATCGAAAT CTGCGGCCAC AAGGCTATCG 1680
GTACCGTGCT GGTGGGCCCC ACACCCGTCA ACATCATCGG ACGCAACCTG TTGACGCAGA 174 0 
TCGGTTGCAC GCTGAACTTC CCCATTAGCC CTATCGAGAC GGTACCGGTG AAGCTGAAGC 1800 
CCGGGATGGA CGGCCCGAAG GTCAAGCAAT GGCCATTGAC AGAGGAGAAG ATCAAGGCAC 1860
TGGTGGAGAT TTGCACAGAG ATGGAAAAGG AAGGGAAAAT CTCCAAGATT GGGCCTGAGA 1920 
ACCCGTACAA CACGCCGGTG TTCGCAATCA AGAAGAAGGA CTCGACGAAA TGGCGCAAGC 19 8 0 
TGGTGGACTT CCGCGAGCTG AACAAGCGCA CGCAAGACTT CTGGGAGGTT CAGCTGGGCA 2 04 0 
TCCCGCACCC CGCAGGGCTG AAGAAGAAGA AATCCGTGAC CGTACTGGAT GTGGGTGATG 2100 
CCTACTTCTC CGTTCCCCTG GACGAAGACT TGAGGAAGTA CACTGCCTTC ACAATCCCTT 216 0 
CGATCAACAA CGAGACACCG GGGATTCGAT ATCAGTACAA CGTGCTGCCC CAGGGCTGGA 222 0 
AAGGCTCTCC CGCAATCTTC CAGAGTAGCA TGACCAAAAT CCTGGAGCCT TTCCGCAAAC 22 8 0 
AGAACCCCGA CATCGTCATC TATCAGTACA TGGATGACTT GTACGTGGGC TCTGATCTAG 234 0 
AGATAGGGCA GCACCGCACC AAGATCGAGG AGCTGCGCCA GCACCTGTTG AGGTGGGGAC 24 0 0 
TGACCACACC CGACAAGAAG CACCAGAAGG AGCCTCCCTT CCTCTGGATG GGTTACGAGC 24 60 
TGCACCCTGA CAAATGGACC GTGCAGCCTA TCGTGCTGCC AGAGAAAGAC AGCTGGACTG 2 52 0 
TCAACGACAT ACAGAAGCTG GTGGGGAAGT TGAACTGGGC CAGTCAGATT TACCCAGGGA 258 0 
TTAAGGTGAG GCAGCTGTGC AAACTCCTCC GCGGAACCAA GGCACTCACA GAGGTGATCC 2 640 
CCCTAACCGA GGAGGCCGAG CTCGAACTGG CAGAAAACCG AGAGATCCTA AAGGAGCCCG 27 00 
TGCACGGCGT GTACTATGAC CCCTCCAAGG ACCTGATCGC CGAGATCCAG AAGCAGGGGC 2760 
AAGGCCAGTG GACCTATCAG ATTTACCAGG AGCCCTTCAA GAACCTGAAG ACCGGCAAGT 2 820 
ACGCCCGGAT GAGGGGTGCC CACACTAACG ACGTCAAGCA GCTGACCGAG GCCGTGCAGA 2880
AGATCACCAC CGAAAGCATC GTGATCTGGG GAAAGACTCC TAAGTTCAAG CTGCCCATCC 2940
AGAAGGAAAC CTGGGAAACC TGGTGGACAG AGTATTGGCA GGCCACCTGG ATTCCTGAGT 3000
GGGAGTTCGT CAACACCCCT CCCCTGGTGA AGCTGTGGTA CCAGCTGGAG AAGGAGCCCA 3060
TAGTGGGCGC CGAAACCTTC TACGTGGATG GGGCCGCTAA CAGGGAGACT AAGCTGGGCA 3120
AAGCCGGATA CGTCACTAAG CGGGGCAGAC AGAAGGTTGT CACCCTCACT GACACCACCA 3180
ACCAGAAGAC TGAGCTGCAG GCCATTTACC TGGCTTTGCA GGACTCGGGC CTGGAGGTGA 3240
ACATCGTGAC AGACTCTCAG TATGCCCTGG GCATCATTCA AGCCCAGCCA GACCAGAGTG 3300
AGTCCGAGCT GGTCAATCAG ATCATCGAGC AGCTGATCAA GAAGGAAAAG GTCTATCTGG 3360
CCTGGGTACC CGCCCACAAA GGCATTGGCG GCAATGAGCA GGTCGACAAG CTGGTCTCGG 3420
CTGGCATCAG GAAGGTGCTA TTCCTGGATG GCATCGACAA GGCCCAGGAC GAGCACGAGA 3480
AATACCACAG CAACTGGCGG GCCATGGCTA GCGACTTCAA CCTGCCCCCT GTGGTGGCCA 3540
AAGAGATCGT GGCCAGCTGT GACAAGTGTC AGCTCAAGGG CGAAGCCATG CATGGCCAGG 3600
TGGACTGTAG CCCCGGCATC TGGCAACTCG ATTGCACCCA TGTGGAGGGC AAGGTTATCC 3660
TGGTAGCCGT CCATGTGGCC AGTGGCTACA TCGAGGCCGA GGTCATTCCC GCCGAAACAG 3720
GGCAGGAGAC AGCCTACTTC CTCCTGAAGC TGGCAGGCCG GTGGCCAGTG AAGACGATCC 3780
ATACTGACAA TGGCAGCAAT TTCACCAGTG CTACGGTTAA GGCCGCCTGC TGGTGGGCGG 3840
GAATCAAGCA GGAGTTCGGG ATCCCCTACA ATCCCCAGAG TCAGGGCGTC GTCGAGTCTA 3900
TGAATAAGGA GTTAAAGAAG ATTATCGGCC AGGTCAGAGA TCAGGCTGAG CATCTCAAGA 3960
CCGCGGTCCA AATGGCGGTA TTCATCCACA ATTTCAAGCG GAAGGGGGGG ATTGGGGGGT 4020
ACAGTGCGGG GGAGCGGATC GTGGACATCA TCGCGACCGA CATCCAGACT AAGGAGCTGC 4080
AAAAGCAGAT TACCAAGATT CAGAATTTCC GGGTCTACTA CAGGGACAGC AGAAATCCCC 4140
TCTGGAAAGG CCCAGCGAAG CTCCTCTGGA AGGGTGAGGG GGCAGTAGTG ATCCAGGATA 4200
ATAGCGACAT CAAGGTGGTG CCCAGAAGAA AGGCGAAGAT CATTAGGGAT TATGGCAAAC 4260
AGATGGCGGG TGATGATTGC GTGGCGAGCA GACAGGATGA GGATTAG 4307



SEQ. I.D. NO. 3 - Envelope Gene from HIV-1 MN (Genbank accession no. M17449)

ATGAGAGTGA AGGGGATCAG GAGGAATTAT CAGCACTGGT GGGGATGGGG CACGATGCTC 60 
CTTGGGTTAT TAATGATCTG TAGTGCTACA GAAAAATTGT GGGTCACAGT CTATTATGGG 120
GTACCTGTGT GGAAAGAAGC AACCACCACT CTATTTTGTG CATCAGATGC TAAAGCATAT 180 
GATACAGAGG TACATAATGT TTGGGCCACA CAAGCCTGTG TACCCACAGA CCCCAACCCA 24 0 
CAAGAAGTAG AATTGGTAAA TGTGACAGAA AATTTTAACA TGTGGAAAAA TAACATGGTA 3 00 
GAACAGATGC ATGAGGATAT AATCAGTTTA TGGGATCAAA GCCTAAAGCC ATGTGTAAAA 360 
TTAACCCCAC TCTGTGTTAC TTTAAATTGC ACTGATTTGA GGAATACTAC TAATACCAAT 420 
AATAGTACTG CTAATAACAA TAGTAATAGC GAGGGAACAA TAAAGGGAGG AGAAATGAAA 480 
AACTGCTCTT TCAATATCAC AACAAGCATA AGAGATAAGA TGCAGAAAGA ATATGCACTT 540
CTTTATAAAC TTGATATAGT ATCAATAGAT AATGATAGTA CCAGCTATAG GTTGATAAGT 60 0 
TGTAATACCT CAGTCATTAC ACAAGCTTGT CCAAAGATAT CCTTTGAGCC AATTCCCATA 660 
CACTATTGTG CCCCGGCTGG TTTTGCGATT CTAAAATGTA ACGATAAAAA GTTCAGTGGA 72 0 
AAAGGATCAT GTAAAAATGT CAGCACAGTA CAATGTACAC ATGGAATTAG GCCAGTAGTA 780 
TCAACTCAAC TGCTGTTAAA TGGCAGTCTA GCAGAAGAAG AGGTAGTAAT TAGATCTGAG 84 0 
AATTTCACTG ATAATGCTAA AACCATCATA GTACATCTGA ATGAATCTGT ACAAATTAAT 900 
TGTACAAGAC CCAACTACAA TAAAAGAAAA AGGATACATA TAGGACCAGG GAGAGCATTT 96 0 
TATACAACAA AAAATATAAT AGGAACTATA AGACAAGCAC ATTGTAACAT TAGTAGAGCA 1020 
AAATGGAATG ACACTTTAAG ACAGATAGTT AGCAAATTAA AAGAACAATT TAAGAATAAA 10 8 0 
AGAATAGTCT TTAATCAATC CTCAGGAGGG GACCCAGAAA TTGTAATGCA CAGTTTTAAT 1140 
TGTGGAGGGG AATTTTTCTA CTGTAATACA TCACCACTGT TTAATAGTAC TTGGAATGGT 1200 
AATAATACTT GGAATAATAC TACAGGGTCA AATAACAATA TCACACTTCA ATGCAAAATA 1260
AAACAAATTA TAAACATGTG GCAGGAAGTA GGAAAAGCAA TGTATGCCCC TCCCATTGAA 1320 
GGACAAATTA GATGTTCATC AAATATTACA GGGCTACTAT TAACAAGAGA TGGTGGTAAG 138 0 
GACACGGACA CGAACGACAC CGAGATCTTC AGACCTGGAG GAGGAGATAT GAGGGACAAT 1440
TGGAGAAGTG AATTATATAA ATATAAAGTA GTAACAATTG AACCATTAGG AGTAGCACCC 150 0 
ACCAAGGCAA AGAGAAGAGT GGTGCAGAGA GAAAAAAGAG CAGCGATAGG AGCTCTGTTC 1560 
CTTGGGTTCT TAGGAGCAGC AGGAAGCACT ATGGGCGCAG CGTCAGTGAC GCTGACGGTA 162 0 
CAGGCCAGAC TATTATTGTC TGGTATAGTG CAACAGCAGA ACAATTTGCT GAGGGCCATT 168 0 
GAGGCGCAAC AGCATATGTT GCAACTCACA GTCTGGGGCA TCAAGCAGCT CCAGGCAAGA 174 0 
GTCCTGGCTG TGGAAAGATA CCTAAAGGAT CAACAGCTCC TGGGGTTTTG GGGTTGCTCT 18 00 
GGAAAACTCA TTTGCACCAC TACTGTGCCT TGGAATGCTA GTTGGAGTAA TAAATCTCTG 186 0 
GATGATATTT GGAATAACAT GACCTGGATG CAGTGGGAAA GAGAAATTGA CAATTACACA 192 0 
AGCTTAATAT ACTCATTACT AGAAAAATCG CAAACCCAAC AAGAAAAGAA TGAACAAGAA 198 0 
TTATTGGAAT TGGATAAATG GGCAAGTTTG TGGAATTGGT TTGACATAAC AAATTGGCTG 2040 
TGGTATATAA AAATATTCAT AATGATAGTA GGAGGCTTGG TAGGTTTAAG AATAGTTTTT 2100 
GCTGTACTTT CTATAGTGAA TAGAGTTAGG CAGGGATACT CACCATTGTG GTTGCAGACC 216 0 
CGCCCCCCAG TTCCGAGGGG ACCCGACAGG CCCGAAGGAA TCGAAGAAGA AGGTGGAGAG 222 0 
AGAGACAGAG ACACATCCGG TCGATTAGTG CATGGATTCT TAGCAATTAT CTGGGTCGAC 2280
CTGCGGAGCC TGTTCCTCTT CAGCTACCAC CACAGAGACT TACTCTTGAT TGCAGCGAGG 2340
ATTGTGGAAC TTCTGGGACG CAGGGGGTGG GAAGTCCTCA AATATTGGTG GAATCTCCTA 240 0 
CAGTATTGGA GTCAGGAACT AAAGAGTAGT GCTGTTAGCT TGCTTAATGC CACAGCTATA 246 0 
GCAGTAGCTG AGGGGACAGA TAGGGTTATA GAAGTACTGC AAAGAGCTGG TAGAGCTATT 252 0 
CTCCACATAC CTACAAGAAT AAGACAGGGC TTGGAAAGGG CTTTGCTATA A 2 571 



SEQ. I.D. NO. 4 - SYNgp-160mn - codon optimised env sequence

ATGAGGGTGA AGGGGATCCG CCGCAACTAC CAGCACTGGT GGGGCTGGGG CACGATGCTC 6 0 
CTGGGGCTGC TGATGATCTG CAGCGCCACC GAGAAGCTGT GGGTGACCGT GTACTACGGC 12 0 
GTGCCCGTGT GGAAGGAGGC CACCACCACC CTGTTCTGCG CCAGCGACGC CAAGGCGTAC 180 
GACACCGAGG TGCACAACGT GTGGGCCACC CAGGCGTGCG TGCCCACCGA CCCCAACCCC 24 0 
CAGGAGGTGG AGCTCGTGAA CGTGACCGAG AACTTCAACA TGTGGAAGAA CAACATGGTG 3 00 
GAGCAGATGC ATGAGGACAT CATCAGCCTG TGGGACCAGA GCCTGAAGCC CTGCGTGAAG 360 
CTGACCCCCC TGTGCGTGAC CCTGAACTGC ACCGACCTGA GGAACACCAC CAACACCAAC 420 
AACAGCACCG CCAACAACAA CAGCAACAGC GAGGGCACCA TCAAGGGCGG CGAGATGAAG 480
AACTGCAGCT TCAACATCAC CACCAGCATC CGCGACAAGA TGCAGAAGGA GTACGCCCTG 540 
CTGTACAAGC TGGATATCGT GAGCATCGAC AACGACAGCA CCAGCTACCG CCTGATCTCC 600 
TGCAACACCA GCGTGATCAC CCAGGCCTGC CCCAAGATCA GCTTCGAGCC CATCCCCATC 660 
CACTACTGCG CCCCCGCCGG CTTCGCCATC CTGAAGTGCA ACGACAAGAA GTTCAGCGGC 720 
AAGGGCAGCT GCAAGAACGT GAGCACCGTG CAGTGCACCC AGGGGATCCG GCCGGTGGTG 780 
AGCACCCAGC TCCTGCTGAA GGGCAGCCTG GCCGAGGAGG AGGTGGTGAT CCGCAGCGAG 84 0 
AACTTCACCG ACAACGCCAA GACCATCATC GTGCACCTGA ATGAGAGCGT GCAGATCAAC 900 
TGCACGCGTC CCAACTACAA CAAGCGCAAG CGCATCCACA TCGGCCCCGG GCGCGCCTTC 960 
TACACCACCA AGAACATCAT CGGCACCATC CGCCAGGCCC ACTGCAACAT CTCTAGAGCC 102 0 
AAGTGGAACG ACACCCTGCG CCAGATCGTG AGCAAGCTGA AGGAGCAGTT CAAGAACAAG 10 8 0 
ACCATCGTGT TCAACCAGAG CAGCGGCGGC GACCCCGAGA TCGTGATGCA CAGCTTCAAC 114 0 
TGCGGCGGCG AATTCTTCTA CTGCAACACC AGCCCCCTGT TCAACAGCAC CTGGAACGGC 12 00 
AACAACACCT GGAACAACAC CACCGGCAGC AACAACAATA TTACCCTCCA GTGCAAGATC 1260 
AAGCAGATCA TCAACATGTG GCAGGAGGTG GGCAAGGCCA TGTACGCGCC CCCCATCGAG 13 20 
GGCCAGATCC GGTGCAGCAG CAACATCACC GGTCTGCTGC TGACCCGCGA CGGCGGCAAG 1380
GACACCGACA CCAACGACAC CGAAATCTTC CGCCCCGGCG GCGGCGACAT GCGCGACAAC 1440 
TGGAGATCTG AGCTGTACAA GTACAAGGTG GTGACGATCG AGCCCCTGGG CGTGGCCCCC 1500 
ACCAAGGCCA AGCGCCGCGT GGTGCAGCGC GAGAAGCGGG CCGCCATCGG CGCCCTGTTC 156 0 
CTGGGCTTCC TGGGGGCGGC GGGCAGCACC ATGGGGGCCG CCAGCGTGAC CCTGACCGTG 1620 
CAGGCCCGCC TGCTCCTGAG CGGCATCGTG CAGCAGGAGA ACAACCTCCT CCGCGCCATC 1680 
GAGGCCCAGC AGCATATGCT CCAGCTCACC GTGTGGGGCA TCAAGCAGCT CCAGGCCCGC 1740 
GTGCTGGCCG TGGAGCGCTA CCTGAAGGAC CAGCAGCTCC TGGGCTTCTG GGGCTGCTCC 18 0 0 
GGCAAGCTGA TCTGCACCAC CACGGTACCC TGGAACGCCT CCTGGAGCAA CAAGAGCCTG 18 6 0 
GACGACATCT GGAACAACAT GACCTGGATG CAGTGGGAGC GCGAGATCGA TAACTACACC 192 0 
AGCCTGATCT ACAGCCTGCT GGAGAAGAGC CAGACCCAGC AGGAGAAGAA CGAGCAGGAG 198 0 
CTGCTGGAGC TGGACAAGTG GGCGAGCCTG TGGAACTGGT TCGACATCAC CAACTGGCTG 204 0 
TGGTACATCA AAATCTTCAT CATGATTGTG GGCGGCCTGG TGGGCCTCCG CATCGTGTTC 210 0 
GCCGTGCTGA GCATCGTGAA CCGCGTGCGC CAGGGCTACA GCCCCCTGAG CCTCCAGACC 216 0 
CGGCCCCCCG TGCCGCGCGG GCCCGACCGC CCCGAGGGCA TCGAGGAGGA GGGCGGCGAG 222 0 
CGCGACCGCG ACACCAGCGG CAGGCTCGTG CACGGCTTCC TGGCGATCAT CTGGGTCGAC 2280
CTCCGCAGCC TGTTCCTGTT CAGCTACCAC CACCGCGACC TGCTGCTGAT CGCCGCCCGC 23 4 0 
ATCGTGGAAC TCCTAGGCCG CCGCGGCTGG GAGGTGCTGA AGTACTGGTG GAACCTCCTC 24 00 
CAGTATTGGA GCCAGGAGCT GAAGTCCAGC GCCGTGAGCC TGCTGAACGC CACCGCCATC 24 6 0 
GCCGTGGCCG AGGGCACCGA CCGCGTGATC GAGGTGCTCC AGAGGGCCGG GAGGGCGATC 252 0 
CTGCACATCC CCACCCGCAT CCGCCAGGGG CTCGAGAGGG CGCTGCTGTA A 25 71 



SEQ. I.D. NO. 11 - Complete Sequence of pH4DOZENEGS

CTGACGCGCC CTGTAGCGGC GCATTAAGCG CGGCGGGTGT GGTGGTTACG CGCAGCGTGA 60 
CCGCTACACT TGCCAGCGCC CTAGCGCCCG CTCCTTTCGC TTTCTTCCCT TCCTTTCTCG 120 
CCACGTTCGC CGGCTTTCCC CGTCAAGCTC TAAATCGGGG GCTCCCTTTA GGGTTCCGAT 18 0 
TTAGTGCTTT ACGGCACCTC GACCCCAAAA AACTTGATTA GGGTGATGGT TCACGTAGTG 2 40 
GGCCATCGCC CTGATAGACG GTTTTTCGCC CTTTGACGTT GGAGTCCACG TTCTTTAATA 3 00 
GTGGACTCTT GTTCCAAACT GGAACAACAC TCAACCCTAT CTCGGTCTAT TCTTTTGATT 36 0 
TATAAGGGAT TTTGCCGATT TCGGCCTATT GGTTAAAAAA TGAGCTGATT TAACAAAAAT 42 0 
TTAACGCGAA TTTTAACAAA ATATTAACGC TTACAATTTC CATTCGCCAT TCAGGCTGCG 4 80 
CAACTGTTGG GAAGGGCGAT CGGTGCGGGC CTCTTCGCTA TTACGCCAGC TGGCGAAAGG 540
GGGATGTGCT GCAAGGCGAT TAAGTTGGGT AACGCCAGGG TTTTCCCAGT CACGACGTTG 600
TAAAACGACG GCCAGTGAGC GCGCGTAATA CGACTCACTA TAGGGCGAAT TGGAGCTCCA 66 0 
CCGCGGTGGC GGCCGCTCTA GAGTCCGTTA CATAACTTAC GGTAAATGGC CCGCCTGGCT 72 0 
GACCGCCCAA CGACCCCCGC CCATTGACGT CAATAATGAC GTATGTTCCC ATAGTAACGC 78 0 
CAATAGGGAC TTTCCATTGA CGTCAATGGG TGGAGTATTT ACGGTAAACT GCCCACTTGG 840 
CAGTACATCA AGTGTATCAT ATGCCAAGTA CGCCCCCTAT TGACGTCAAT GACGGTAAAT 900 
GGCCCGCCTG GCATTATGCC CAGTACATGA CCTTATGGGA CTTTCCTACT TGGCAGTACA 960 
TCTACGTATT AGTCATCGCT ATTACCATGG TGATGCGGTT TTGGCAGTAC ATCAATGGGC 102 0 
GTGGATAGCG GTTTGACTCA CGGGGATTTC CAAGTCTCCA CCCCATTGAC GTCAATGGGA 1080 
GTTTGTTTTG GCACCAAAAT CAACGGGACT TTCCAAAATG TCGTAACAAC TCCGCCCCAT 1140 
TGACGCAAAT GGGCGGTAGG CGTGTACGGT GGGAGGTCTA TATAAGCAGA GCTCGTTTAG 1200 
TGAACCGGTC TCTCTGGTTA GACCAGATCT GAGCCTGGGA GCTCTCTGGC TAACTAGGGA 1260
ACCCACTGCT TAAGCCTCAA TAAAGCTTGC CTTGAGTGCT TCAAGTAGTG TGTGCCCGTC 1320 
TGTTGTGTGA CTCTGGTAAC TAGAGATCCC TCAGACCCTT TTAGTCAGTG TGGAAAATCT 1380 
CTAGCAGTGG CGCCCGAACA GGGACTTGAA AGCGAAAGGG AAACCAGAGG AGCTCTCTCG 1440 
ACGCAGGACT CGGCTTGCTG AAGCGCGCAC GGCAAGAGGC GAGGGGCGGC GACTGGTGAG 1500
TACGCCAAAA ATTTTGACTA GCGGAGGCTA GAAGGAGAGA GATGGGTGCG AGAGCGTCAG 1560 
TATTAAGCGG GGGAGAATTA GATCGCGATG GGAAAAAATT CGGTTAAGGC CAGGGGGAAA 162 0 
GAAAAAATAT AAATTAAAAC ATATAGTATG GGCAAGCAGG GAGCTAGAAC GATTCGCAGT 168 0 
TAATCCTGGC CTGTTAGAAA CATCAGAAGG CTGTAGACAA ATACTGGGAC AGCTACAACC 174 0 
ATCCCTTCAG ACAGGATCAG AAGAACTTAG ATCATTATAT AATACAGTAG CAACCCTCTA 1800
TTGTGTGCAT CAAAGGTTGA GATAAAAGAC ACCAAGGAAG CTTTAGACAA GATAGAGGGA 186 0 
GAGCAAAACA AAAGTAAGAA AAAAGCACAG CAAGCAGCAG CTGACACAGG ACACAGCAAT 192 0 
CAGGTCAGCC AAAATTACCC TATAGTGCAG AACATCCAGG GGCAAATGGT ACATCAGGCC 1980
ATATCACCTA GAACTTTAAA TGCATGGGTA AAAGTAGTAG AAGAGAAGGC TTTCAGCCCA 2 04 0 
GAAGTGATAC CCATGTTTTC AGCATTATCA GAAGGAGCCA CCCCACAAGA TTTAAACACC 210 0 
ATGCTAAACA CAGTGGGGGG ACATCAAGCA GCCATGCAAA TGTTAAAAGA GACCATCAAT 216 0 
GAGGAAGCTG CAGGAATTCG CCTAAAACTG CTTGTACCAA TTGCTATTGT AAAAAGTGTT 222 0 
GCTTTCATTG CCAAGTTTGT TTCATAACAA AAGCCTTAGG CATCTCCTAT GGCAGGAAGA 228 0 
AGCGGAGACA GCGACGAAGA GCTCATCAGA ACAGTCAGAC TCATCAAGCT TCTCTATCAA 234 0 
AGCAGTAAGT AGTACATGTA ACGCAACCTA TACCAATAGT AGCAATAGTA GCATTAGTAG 240 0 
TAGCAATAAT AATAGCAATA GTTGTGTGGT CCATAGTAAT CATAGAATAT AGGAAAATAT 246 0 
TAAGACAAAG AAAAATAGAC AGGTTAATTG ATAGACTAAT AGAAAGAGCA GAAGACAGTG 252 0 
GCAATGAGAG TGAAGGAGAA ATATCAGCAC TTGTGGAGAT GGGGGTGGAG ATGGGGCACC 2580 
ATGCTCCTTG GGATGTTGAT GATCTGTAGT GCTACAGAAA AATTGTGGGT CACAGTCTAT 264 0 
TATGGGGTAC CTGTGTGGAA GGAAGCAACC ACCACTCTAT TTTGTGCATC AGATGCTAAA 2700 
GCATAGATCT TCAGACTTGG AGGAGGAGAT ATGAGGGACA ATTGGAGAAG TGAATTATAT 2760 
AAATATAAAG TAGTAAAAAT TGAACCATTA GGAGTAGCAC CCACCAAGGC AAAGAGAAGA 282 0 
GTGGTGCAGA GAGAAAAAAG AGCAGTGGGA ATAGGAGCTT TGTTCCTTGG GTTCTTGGGA 288 0 
GCAGCAGGAA GCACTATGGG CGCAGCGTCA ATGACGCTGA CGGTACAGGC CAGACAATTA 2 94 0 
TTGTCTGGTA TAGTGCAGCA GCAGAACAAT TTGCTGAGGG CTATTGAGGC GCAACAGCAT 3000 
CTGTTGCAAC TCACAGTCTG GGGCATCAAG CAGCTCCAGG CAAGAATCCT GGCTGTGGAA 3060 
AGATACCTAA AGGATCAACA GCTCCTGGGG ATTTGGGGTT GCTCTGGAAA ACTCATTTGC 3120 
ACCACTGCTG TGCCTTGGAA TGCTAGTTGG AGTAATAAAT CTCTGGAACA GATCTGGAAT 3180 
CACACGACCT GGATGGAGTG GGACAGAGAA ATTAACAATT ACACAAGCTT AATACACTCC 3240 
TTAATTGAAG AATCGCAAAA CCAGCAAGAA AAGAATGAAC AAGAATTATT GGAATTAGAT 3 3 00 
AAATGGGCAA GTTTGTGGAA TTGGTTTAAC ATAACAAATT GGCTGTGGTA TATAAAATTA 3360 
TTCATAATGA TAGTAGGAGG CTTGGTAGGT TTAAGAATAG TTTTTGCTGT ACTTTCTATA 3420 
GTGAATAGAG TTAGGCAGGG ATATTCACCA TTATCGTTTC AGACCCACCT CCCAACCCCG 3480 
AGGGGACCCG ACAGGCCCGA AGGAATAGAA GAAGAAGGTG GAGAGAGAGA CAGAGACAGA 3 540 
TCCATTCGAT TAGTGAACGG ATCCTTGGCA CTTATCTGGG ACGATCTGCG GAGCCTGTGC 36 00 
CTCTTCAGCT ACCACCGCTT GAGAGACTTA CTCTTGATTG TAACGAGGAT TGTGGAACTT 3660 
CTGGGACGCA GGGGGTGGGA AGCCCTCAAA TATTGGTGGA ATCTCCTACA GTATTGGAGT 3720 
CAGGAACTAA AGAATAGTGC TGTTAGCTTG CTCAATGCCA CAGCCATAGC AGTAGCTGAG 3 780 
GGGACAGATA GGGTTATAGA AGTAGTACAA GGAGCTTGTA GAGCTATTCG CCACATACCT 3 840 
AGAAGAATAA GACAGGGCTT GGAAAGGATT TTGCTATAAG ATGGGTGGCA AGTGGTCAAA 3900 
AAGTAGTGTG ATTGGATGGC CTACTGTAAG GGAAAGAATG AGACGAGCTG AGCCAGCAGC 3 960 
AGATAGGGTG GGAGCAGCAT CTCGACGCTG CAGGAGTGGG GAGGCACGAT GGCCGCTTTG 40 20 
GTCGAGGCGG ATCCGGCCAT TAGCCATATT ATTCATTGGT TATATAGCAT AAATCAATAT 4080
TGGCTATTGG CCATTGCATA CGTTGTATCC ATATCATAAT ATGTACATTT ATATTGGCTC 4140 
ATGTCCAACA TTACCGCCAT GTTGACATTG ATTATTGACT AGTTATTAAT AGTAATCAAT 42 00 
TACGGGGTCA TTAGTTCATA GCCCATATAT GGAGTTCCGC GTTACATAAC TTACGGTAAA 4260 
TGGCCCGCCT GGCTGACCGC CCAACGACCC CCGCCCATTG ACGTCAATAA TGACGTATGT 4320 
TCCCATAGTA ACGCCAATAG GGACTTTCCA TTGACGTCAA TGGGTGGAGT ATTTACGGTA 4380 
AACTGCCCAC TTGGCAGTAC ATCAAGTGTA TCATATGCCA AGTACGCCCC CTATTGACGT 4440
CAATGACGGT AAATGGCCCG CCTGGCATTA TGCCCAGTAC ATGACCTTAT GGGACTTTCC 4 500 
TACTTGGCAG TACATCTACG TATTAGTCAT CGCTATTACC ATGGTGATGC GGTTTTGGCA 4 560 
GTACATCAAT GGGCGTGGAT AGCGGTTTGA CTCACGGGGA TTTCCAAGTC TCCACCCCAT 462 0 
TGACGTCAAT GGGAGTTTGT TTTGGCACCA AAATCAACGG GACTTTCCAA AATGTCGTAA 4 68 0 
CAACTCCGCC CCATTGACGC AAATGGGCGG TAGGCATGTA CGGTGGGAGG TCTATATAAG 4 74 0 
CAGAGCTCGT TTAGTGAACC GTCAGATCGC CTGGAGACGC CATCCACGCT GTTTTGACCT 4 8 00 
CCATAGAAGA CACCGGGACC GATCCAGCCT CCGCGGCCCC AAGCTTCAGC TGCTCGAGCC 486 0 
CGGGGATGAC GTCATCGACT TCGAAGGTTC GAATCCTTCT ACTGCCACCA TTTTTTCTCT 4 920 
ACGTCATCGA CTTCGAAGGT TCGAATCCTT CCCTGTCCAC CAGTCGAGTA TTACGTCATC 4 98 0 
GACTTCGAAG GTTCGAATCC TTCTAGATTC ACCATTTTTT AGGAACGTCA TCGACTTCGA 504 0 
AGGTTCGAAT CCTTCCAGTT CCACCAGTCG AGGCCAACGT CATCGACTTC GAAGGTTCGA 510 0 
ATCCTTCTCT TCCCACCATT TTTTTTCCAC GTCATCGACT TCGAAGGTTC GAATCCTTCG 516 0 
GGGCCCACCA GTCGAGGGCT ACGTCATCGA CTTCGAAGGT TCGAATCCTT CTTGCTTCAC 522 0 
CATTTTTTCT GAACGTCATC GACTTCGAAG GTTCGAATCC TTCTGCTGTC ACCAGTCGAG 52 8 0 
TATAACGTCA TCGACTTCGA AGGTTCGAAT CCTTGACCGG TCACCATTTT TTTATAACGT 534 0 
CATCGACTTC GAAGGTTCGA ATCCTTCTTC TTACACCAGT CGAGGTACAC GTCATCGACT 5400
TCGAAGGTTC GAATCCTTCG TAGTTCACCA TTTTTTGTGC ACGTCATCGA CTTCGAAGGT 5460 
TCGAATCCTT CTAGGCCCAC CAGTCGACGC ATGCCTGCAG GTCGAGGTCG ATACCGTCGA 552 0 
GACCTAGAAA AACATGGAGC AATCACAAGT AGCAATACAG CAGCTACCAA TGCTGATTGT 55 8 0 
GCCTGGCTAG AAGCACAAGA GGAGGAGGAG GTGGGTTTTC CAGTCACACC TCAGGTACCT 564 0 
TTAAGACCAA TGACTTACAA GGCAGCTGTA GATCTTAGCC ACTTTTTAAA AGAAAAGGGG 57 0 0 
GGACTGGAAG GGCTAATTCA CTCCCAACGA AGACAAGATA TCCTTGATCT GTGGATCTAC 5 76 0 
CACACACAAG GCTACTTCCC TGATTGGCAG AACTACACAC CAGGGCCAGG GATCAGATAT 582 0 
CCACTGACCT TTGGATGGTG CTACAAGCTA GTACCAGTTG AGCAAGAGAA GGTAGAAGAA 58 8 0 
GCCAATGAAG GAGAGAACAC CCGCTTGTTA CACCCTGTGA GCCTGCATGG GATGGATGAC 594 0 
CCGGAGAGAG AAGTATTAGA GTGGAGGTTT GACAGCCGCC TAGCATTTCA TCACATGGCC 600 0 
CGAGAGCTGC ATCCGGAGTA CTTCAAGAAC TGCTGACATC GAGCTTGCTA CAAGGGACTT 6 06 0 
TCCGCTGGGG ACTTTCCAGG GAGGCGTGGC CTGGGCGGGA CTGGGGAGTG GCGAGCCCTC 612 0 
AGATGCTGCA TATAAGCAGC TGCTTTTTGC CTGTACTGGG TCTCTCTGGT TAGACCAGAT 618 0 
CTGAGCCTGG GAGCTCTCTG GCTAACTAGG GAACCCACTG CTTAAGCCTC AATAAAGCTT 6240 
GCCTTGAGTG CTTCAAGTAG TGTGTGCCCG TCTGTTGTGT GACTCTGGTA ACTAGAGATC 63 00 
CCTCAGACCC TTTTAGTCAG TGTGGAAAAT CTCTAGCAGT CGAGGGGGGG CCCGGTACCC 63 6 0 
AGCTTTTGTT CCCTTTAGTG AGGGTTAATT GCGCGCTTGG CGTAATCATG GTCATAGCTG 642 0 
TTTCCTGTGT GAAATTGTTA TCCGCTCACA ATTCCACACA ACATACGAGC CGGAAGCATA 64 8 0 
AAGTGTAAAG CCTGGGGTGC CTAATGAGTG AGCTAACTCA CATTAATTGC GTTGCGCTCA 6540 
CTGCCCGCTT TCCAGTCGGG AAACCTGTCG TGCCAGCTGC ATTAATGAAT CGGCCAACGC 6 6 00 
GCGGGGAGAG GCGGTTTGCG TATTGGGCGC TCTTCCGCTT CCTCGCTCAC TGACTCGCTG 666 0 
CGCTCGGTCG TTCGGCTGCG GCGAGCGGTA TCAGCTCACT CAAAGGCGGT AATACGGTTA 672 0 
TCCACAGAAT CAGGGGATAA CGCAGGAAAG AACATGTGAG CAAAAGGCCA GCAAAAGGCC 67 8 0 
AGGAACCGTA AAAAGGCCGC GTTGCTGGCG TTTTTCGATA GGCTCCGCCC CCCTGACGAG 684 0 
CATCACAAAA ATCGACGCTC AAGTCAGAGG TGGCGAAACC CGACAGGACT ATAAAGATAC 690 0 
CAGGCGTTTC CCCCTGGAAG CTCCCTCGTG CGCTCTCCTG TTCCGACCCT GCCGCTTACC 696 0 
GGATACCTGT CCGCCTTTCT CCCTTCGGGA AGCGTGGCGC TTTCTCATAG CTCACGCTGT 7 02 0 
AGGTATCTCA GTTCGGTGTA GGTCGTTCGC TCCAAGCTGG GCTGTGTGCA CGAACCCCCC 7 08 0 
GTTCAGCCCG ACCGCTGCGC CTTATCCGGT AACTATCGTC TTGAGTCCAA CCCGGTAAGA 714 0 
CACGACTTAT CGCCACTGGC AGCAGCCACT GGTAACAGGA TTAGCAGAGC GAGGTATGTA 72 00 
GGCGGTGCTA CAGAGTTCTT GAAGTGGTGG CCTAACTACG GCTACACTAG AAGGACAGTA 72 6 0 
TTTGGTATCT GCGCTCTGCT GAAGCCAGTT ACCTTCGGAA AAAGAGTTGG TAGCTCTTGA 732 0 
TCCGGCAAAC AAACCACCGC TGGTAGCGGT GGTTTTTTTG TTTGCAAGCA GCAGATTACG 73 8 0 
CGCAGAAAAA AAGGATCTCA AGAAGATCCT TTGATCTTTT CTACGGGGTC TGACGCTCAG 744 0 
TGGAACGAAA ACTCACGTTA AGGGATTTTG GTCATGAGAT TATCAAAAAG GATCTTCACC 7 50 0 
TAGATCCTTT TAAATTAAAA ATGAAGTTTT AAATCAATCT AAAGTATATA TGAGTAAACT 75 6 0 
TGGTCTGACA GTTACCAATG CTTAATCAGT GAGGCACCTA TCTCAGCGAT CTGTCTATTT 76 2 0 
CGTTCATCCA TAGTTGCCTG AGTCCCCGTC GTGTAGATAA CTACGATACG GGAGGGCTTA 768 0 
CCATCTGGCC CCAGTGCTGC AATGATACCG CGAGACCCAC GCTCACCGGC TCCAGATTTA 774 0 
TCAGCAATAA ACCAGCCAGC CGGAAGGGCC GAGCGCAGAA GTGGTCCTGC AACTTTATCC 78 0 0 
GCCTCCATCC AGTCTATTAA TTGTTGCCGG GAAGCTAGAG TAAGTAGTTC GCCAGTTAAT 7 86 0 
AGTTTGCGCA ACGTTGTTGC CATTGCTACA GGCATCGTGG TGTCACGCTC GTCGTTTGGT 7920
ATGGCTTCAT TCAGCTCCGG TTCCCAACGA TCAAGGCGAG TTACATGATC CCCCATGTTG 7 98 0 
TGCAAAAAAG CGGTTAGCTC CTTCGGTCCT CCGATCGTTG TCAGAAGTAA GTTGGCGGCA 8 04 0 
GTGTTATCAC TCATGGTTAT GGCAGCACTG CATAATTCTC TTACTGTCAT GCCATCCGTA 8100 
AGATGCTTTT CTGTGACTGG TGAGTACTCA ACCAAGTCAT TCTGAGAATA GTGTATGCGG 8160 
CGACCGAGTT GCTCTTGCCC GGCGTCAATA CGGGATAATA CCGCGCCACA TAGCAGAACT 822 0 
TTAAAAGTGC TCATCATTGG AAAACGTTGT TCGGGGCGAA AACTCTCAAG GATCTTACCG 8280
CTGTTGAGAT CCAGTTCGAT GTAACCCACT CGTGCACCCA ACTGATCTTC AGCATCTTTT 8340
AGTTTCACCA GCGTTTCTGG GTGAGCAAAA ACAGGAAGGC AAAATGCCGC AAAAAAGGGA 8400
ATAAGGGCGA CACGGAAATG TTGAATACTC ATACTCTTCC TTTTTCAATA TTATTGAAGC 8460
ATTTATCAGG GTTATTGTCT CATGAGCGGA TACATATTTG AATGTATTTA GAAAAATAAA 8520
CAAATAGGGG TTCCGCGCAC ATTTCCCCGA AAAGTGCCAC 8560



SEQ. I.D. NO. 12 - pSYNGP2 - codon optimised HIV-1 gagpol with leader sequence

1 GGGTCTCTCT GGTTAGACCA GATCTGAGCC TGGGAGCTCT CTGGCTAACT AGGGAACCCA 
61 CTGCTTAAGC CTCAATAAAG CTTGCCTTGA GTGCTTCAAG TAGTGTGTGC CCGTCTGTTG 
121 TGTGACTCTG GTAACTAGAG ATCCCTCAGA CCCTTTTAGT CAGTGTGGAA AATCTCTAGC 
181 AGTGGCGCCC GAACAGGGAC CTGAAAGCGA AAGGGAAACC AGAGCTCTCT CGACGCAGGA 
241 CTCGGCTTGC TGAAGCGCCC GCACGGCAAG AGGCGAGGGG CGGCGACTGG TGAGTACGCC 
3 01 AAAAATTTTG ACTAGCGGAG GCTAGAAGGA GAGAGATGGG CGCCCGCGCC AGCGTGCTGT 
361 CGGGCGGCGA GCTGGACCGC TGGGAGAAGA TCCGCCTGCG CCCCGGCGGC AAAAAGAAGT 
421 ACAAGCTGAA GCACATCGTG TGGGCCAGCC GCGAACTGGA GCGCTTCGCC GTGAACCCCG 
481 GGCTCCTGGA GACCAGCGAG GGGTGCCGCC AGATCCTCGG CCAACTGCAG CCCAGCCTGC 
541 AAACCGGCAG CGAGGAGCTG CGCAGCCTGT ACAACACCGT GGCCACGCTG TACTGCGTCC 
601 ACCAGCGCAT CGAAATCAAG GATACGAAAG AGGCCCTGGA TAAAATCGAA GAGGAACAGA 
661 ATAAGAGCAA AAAGAAGGCC CAACAGGCCG CCGCGGACAC CGGACACAGC AACCAGGTCA 
721 GCCAGAACTA CCCCATCGTG CAGAACATCC AGGGGCAGAT GGTGCACCAG GCCATCTCCC 
7 81 CCCGCACGCT GAACGCCTGG GTGAAGGTGG TGGAAGAGAA GGCTTTTAGC CCGGAGGTGA 
841 TACCCATGTT CTCAGCCCTG TCAGAGGGAG CCACCCCCCA AGATCTGAAC ACCATGCTCA 
901 ACACAGTGGG GGGACACCAG GCCGCCATGC AGATGCTGAA GGAGACCATC AATGAGGAGG 
961 CTGCCGAATG GGATCGTGTG CATCCGGTGC ACGCAGGGCC CATCGCACCG GGCCAGATGC 
1021 GTGAGCCACG GGGCTCAGAC ATCGCCGGAA CGACTAGTAC CCTTCAGGAA CAGATCGGCT 
10 81 GGATGACCAA CAACCCACCC ATCCCGGTGG GAGAAATCTA CAAACGCTGG ATCATCCTGG 
1141 GCCTGAACAA GATCGTGCGC ATGTATAGCC CTACCAGCAT CCTGGACATC CGCCAAGGCC 

12 01 CGAAGGAACC CTTTCGCGAC TACGTGGACC GGTTCTACAA AACGCTCCGC GCCGAGCAGG 
1261 CTAGCCAGGA GGTGAAGAAC TGGATGACCG AAACCCTGCT GGTCCAGAAC GCGAACCCGG 
1321 ACTGCAAGAC GATCCTGAAG GCCCTGGGCC CAGCGGCTAC CCTAGAGGAA ATGATGACCG 

13 81 CCTGTCAGGG AGTGGGCGGA CCCGGCCACA AGGCACGCGT CCTGGCTGAG GCCATGAGCC 
1441 AGGTGACCAA CTCCGCTACC ATCATGATGC AGCGCGGCAA CTTTCGGAAC CAACGCAAGA 
1501 TCGTCAAGTG CTTCAACTGT GGCAAAGAAG GGCACACAGC CCGCAACTGC AGGGCCCCTA 
1561 GGAAAAAGGG CTGTTGGAAA TGTGGAAAGG AAGGACACCA AATGAAAGAT TGTACTGAGA 
1621 GACAGGCTAA TTTTTTAGGG AAGATCTGGC CTTCCCACAA GGGAAGGCCA GGGAATTTTC 
16 81 TTCAGAGCAG ACCAGAGCCA ACAGCCCCAC CAGAAGAGAG CTTCAGGTTT GGGGAAGAGA 
1741 CAACAACTCC CTCTCAGAAG CAGGAGCCGA TAGACAAGGA ACTGTATCCT TTAGCTTCCC
18 01 TCAGATCACT CTTTGGCAGC GACCCCTCGT CACAATAAAG ATAGGGGGGC AGCTCAAGGA 
1861 GGCTCTCCTG GACACCGGAG CAGACGACAC CGTGCTGGAG GAGATGTCGT TGCCAGGCCG 
1921 CTGGAAGCCG AAGATGATCG GGGGAATCGG CGGTTTCATC AAGGTGCGCC AGTATGACCA 
1981 GATCCTCATC GAAATCTGCG GCCACAAGGC TATCGGTACC GTGCTGGTGG GCCCCACACC 
2041 CGTCAACATC ATCGGACGCA ACCTGTTGAC GCAGATCGGT TGCACGCTGA ACTTCCCCAT 
2101 TAGCCCTATC GAGACGGTAC CGGTGAAGCT GAAGCCCGGG ATGGACGGCC CGAAGGTCAA 
2161 GCAATGGCCA TTGACAGAGG AGAAGATCAA GGCACTGGTG GAGATTTGCA CAGAGATGGA
2221 AAAGGAAGGG AAAATCTCCA AGATTGGGCC TGAGAACCCG TACAACACGC CGGTGTTCGC
2281 AATCAAGAAG AAGGACTCGA CGAAATGGCG CAAGCTGGTG GACTTCCGCG AGCTGAACAA

2341 GCGCACGCAA GACTTCTGGG AGGTTCAGCT GGGCATCCCG CACCCCGCAG GGCTGAAGAA 
2401 GAAGAAATCC GTGACCGTAC TGGATGTGGG TGATGCCTAC TTCTCCGTTC CCCTGGACGA 
2461 AGACTTCAGG AAGTACACTG CCTTCACAAT GCCTTCGATC AACAACGAGA CACCGGGGAT 
2521 TCGATATCAG TACAACGTGC TGCCCCAGGG CTGGAAAGGC TCTCCCGCAA TCTTCCAGAG 
2 581 TAGCATGACC AAAATCCTGG AGCCTTTCCG CAAACAGAAC CCCGACATCG TCATCTATCA 
2641 GTACATGGAT GACTTGTACG TGGGCTCTGA TCTAGAGATA GGGCAGCACC GCACCAAGAT 
2 7 01 CGAGGAGCTG CGCCAGCACC TGTTGAGGTG GGGACTGACC ACACCCGACA AGAAGCACCA 
2761 GAAGGAGCCT CCCTTCCTCT GGATGGGTTA CGAGCTGCAC CCTGACAAAT GGACCGTGCA
2821 GCCTATCGTG CTGCCAGAGA AAGACAGCTG GACTGTCAAC GACATACAGA AGCTGGTGGG 
28 81 GAAGTTGAAC TGGGCCAGTC AGATTTACCC AGGGATTAAG GTGAGGCAGC TGTGCAAACT 

2 941 CCTCCGCGGA ACCAAGGCAC TCACAGAGGT GATCCCCCTA ACCGAGGAGG CCGAGCTCGA 

3 0 01 ACTGGCAGAA AACCGAGAGA TCCTAAAGGA GCCCGTGCAC GGCGTGTACT ATGACCCCTC 
3 061 CAAGGACCTG ATCGCCGAGA TCCAGAAGCA GGGGCAAGGC CAGTGGACCT ATCAGATTTA 
3121 CCAGGAGCCC TTCAAGAACC TGAAGACCGG CAAGTACGCC CGGATGAGGG GTGCCCACAC 
3181 TAACGACGTC AAGCAGCTGA CCGAGGCCGT GCAGAAGATC ACCACCGAAA GCATCGTGAT 
3 241 CTGGGGAAAG ACTCCTAAGT TCAAGCTGCC CATCCAGAAG GAAACCTGGG AAACCTGGTG 
3 3 01 GACAGAGTAT TGGCAGGCCA CCTGGATTCC TGAGTGGGAG TTCGTCAACA CCCCTCCCCT 
3361 GGTGAAGCTG TGGTACCAGC TGGAGAAGGA GCCCATAGTG GGCGCCGAAA CCTTCTACGT 
3421 GGATGGGGCC GCTAACAGGG AGACTAAGCT GGGCAAAGCC GGATACGTCA CTAACCGGGG
3481 CAGACAGAAG GTTGTCACCC TCACTGACAC CACCAACCAG AAGACTGAGC TGCAGGCCAT
3541 TTACCTCGCT TTGCAGGACT CGGGCCTGGA GGTGAACATC GTGACAGACT CTCAGTATGC
3601 CCTGGGCATC ATTCAAGCCC AGCCAGACCA GAGTGAGTCC GAGCTGGTCA ATGAGATCAT
3661 CGAGCAGCTG ATCAAGAAGG AAAAGGTCTA TCTGGCCTGG GTACCCGCCC ACAAAGGCAT
3721 TGGCGGCAAT GAGCAGGTCG ACAAGCTGGT CTCGGCTGGC ATCAGGAAGG TGCTATTCCT
3781 GGATGGCATC GACAAGGCCC AGGACGAGCA CGAGAAATAC CACAGCAACT GGCGGGCCAT
3841 GGCTAGCGAC TTCAACCTGC CCCCTGTGGT GGCCAAAGAG ATCGTGGCCA GCTGTGACAA
3901 GTGTCAGCTC AAGGGCGAAG CCATGCATGG CCAGGTGGAC TGTAGCCCCG GCATCTGGCA
3961 ACTCGATTGC ACCCATCTGG AGGGCAAGGT TATCCTGGTA GCCGTCCATG TGGCCAGTGG
4021 CTACATCGAG GCCGAGGTCA TTCCCGCCGA AACAGGGCAG GAGACAGCCT ACTTCCTCCT
4081 GAAGCTGGCA GGCCGGTGGC CAGTGAAGAC CATCCATACT GACAATGGCA GCAATTTCAC
4141 CAGTGCTACG GTTAAGGCCG CCTGCTGGTG GGCGGGAATC AAGCAGGAGT TCGGGATCCC
4201 CTACAATCCC CAGAGTCAGG GCGTCGTCGA GTCTATGAAT AAGGAGTTAA AGAAGATTAT
4261 CGGCCAGGTC AGAGATCAGG CTGAGCATCT CAAGACCGCG GTCCAAATGG CGGTATTCAT
4321 CCACAATTTC AAGCGGAAGG GGGGGATTGG GGGGTACAGT GCGGGGGAGC GGATCGTGGA
4381 CATCATCGCG ACCGACATCC AGACTAAGGA GCTGCAAAAG CAGATTACCA AGATTCAGAA
4441 TTTCCGGGTC TACTACAGGG ACAGCAGAAA TCCCCTCTGG AAAGGCCCAG CGAAGCTCCT
4501 CTGGAAGGGT GAGGGGGCAG TAGTGATCCA GGATAATAGC GACATCAAGG TGGTGCCCAG
4561 AAGAAAGGCG AAGATCATTA GGGATTATGG CAAACAGATG GCGGGTGATG ATTGCGTGGC
4621 GAGCAGACAG GATGAGGATT AG



SEQ. I.D. NO. 13 - pSYNGP3 - codon optimised HIV-1 gagpol with leader sequence from
the major splice donor

1 GTGAGTACGC CAAAAATTTT GACTAGCGGA GGCTAGAAGG AGAGAGATGG GCGCCCGCGC 
61 CAGCGTGCTG TCGGGCGGCG AGCTGGACCG CTGGGAGAAG ATCCGCCTGC GCCCCGGCGG 
121 CAAAAAGAAG TACAAGCTGA AGCACATCGT GTGGGCCAGC CGCGAACTGG AGCGCTTCGC 
181 CGTGAACCCC GGGCTCCTGG AGACCAGCGA GGGGTGCCGC CAGATCCTCG GCCAACTGCA 
241 GCCCAGCCTG CAAACCGGCA GCGAGGAGCT GCGCAGCCTG TACAACACCG TGGCCACGCT 
3 01 GTACTGCGTC CACCAGCGCA TCGAAATCAA GGATACGAAA GAGGCCCTGG ATAAAATCGA 
361 AGAGGAACAG AATAAGAGCA AAAAGAAGGC CCAACAGGCC GCCGCGGACA CCGGACACAG
421 CAACCAGGTC AGCCAGAACT ACCCCATCGT GCAGAACATC CAGGGGCAGA TGGTGCACCA 
481 GGCCATCTCC CCCCGCACGC TGAACGCCTG GGTGAAGGTG GTGGAAGAGA AGGCTTTTAG 
541 CCCGGAGGTG ATACCCATGT TCTCAGCCCT GTCAGAGGGA GCCACCCCCC AAGATCTGAA 
6 01 CACCATGCTC AACACAGTGG GGGGACACCA GGCCGGCATG CAGATGCTGA AGGAGACGAT 
661 CAATGAGGAG GCTGCCGAAT GGGATCGTGT GCATCCGGTG CACGCAGGGC CCATCGCACC 
721 GGGCCAGATG CGTGAGCCAC GGGGCTCAGA CATCGCCGGA ACGACTAGTA CCCTTCAGGA 
781 ACAGATCGGC TGGATGACCA ACAACCCACC CATCCCGGTG GGAGAAATCT ACAAACGCTG 
841 GATCATCCTG GGCCTGAACA AGATCGTGCG CATGTATAGC CCTACCAGCA TCCTGGACAT 
9 01 CCGCCAAGGC CCGAAGGAAC CCTTTCGCGA CTACGTGGAC CGGTTCTACA AAACGCTCCG 
961 CGCCGAGCAG GCTAGCCAGG AGGTGAAGAA CTGGATGACC GAAACCCTGC TGGTCCAGAA 
1021 CGCGAACCCG GACTGCAAGA CGATCCTGAA GGCCCTGGGC CCAGCGGCTA CCCTAGAGGA 
1081 AATGATGACC GCCTGTGAGG GAGTGGGCGG ACCCGGCCAC AAGGCACGCG TCCTGGCTGA 
1141 GGCCATGAGC CAGGTGACCA ACTCCGCTAC CATCATGATG CAGCGCGGCA ACTTTCGGAA

12 01 CCAACGCAAG ATCGTCAAGT GCTTCAACTG TGGCAAAGAA GGGCACACAG CCCGCAACTG 
1261 CAGGGCCCCT AGGAAAAAGG GCTGTTGGAA ATGTGGAAAG GAAGGACACC AAATGAAAGA 

13 21 TTGTACTGAG AGACAGGCTA ATTTTTTAGG GAAGATCTGG CCTTCCCACA AGGGAAGGCC 
1381 AGGGAATTTT CTTCAGAGCA GACCAGAGCC AACAGCCCCA CCAGAAGAGA GCTTCAGGTT 
1441 TGGGGAAGAG ACAACAACTC CCTCTCAGAA GCAGGAGCCG ATAGACAAGG AACTGTATCC 
15 01 TTTAGCTTCC CTCAGATCAC TCTTTGGCAG CGACCCCTCG TCACAATAAA GATAGGGGGG 

15 61 CAGCTCAAGG AGGCTCTCCT GGACACCGGA GCAGACGACA CCGTGCTGGA GGAGATGTCG 
1621 TTGCCAGGCC GCTGGAAGCC GAAGATGATC GGGGGAATCG GCGGTTTCAT CAAGGTGCGC 

16 81 CAGTATGACC AGATCCTCAT CGAAATCTGC GGCCACAAGG CTATCGGTAC CGTGCTGGTG 
1741 GGCCCCACAC CCGTCAACAT CATCGGACGC AACCTGTTGA CGCAGATCGG TTGCACGCTG 
18 01 AACTTCCCCA TTAGCCCTAT CGAGACGGTA CCGGTGAAGC TGAAGCCCGG GATGGACGGC 
1861 CCGAAGGTCA AGCAATGGCC ATTGACAGAG GAGAAGATCA AGGCACTGGT GGAGATTTGC 
1921 AGAGAGATGG AAAAGGAAGG GAAAATCTCC AAGATTGGGC CTGAGAACCC GTACAACACG 
1981 CCGGTGTTCG CAATCAAGAA GAAGGACTCG ACGAAATGGC GCAAGCTGGT GGACTTCCGC 
2041 GAGCTGAACA AGCGCACGCA AGACTTCTGG GAGGTTCAGC TGGGCATCCC GCACCCCGCA 
2101 GGGCTGAAGA AGAAGAAATC CGTGACCGTA CTGGATGTGG GTGATGCCTA CTTCTCCGTT 
2161 CCCCTGGACG AAGACTTCAG GAAGTACACT GCCTTCACAA TCCCTTCGAT CAACAACGAG 
2221 ACACCGGGGA TTCGATATCA GTACAACGTG CTGCCCCAGG GCTGGAAAGG CTCTCCCGCA
2281 ATCTTCCAGA GTAGCATGAC CAAAATCCTG GAGCCTTTCC GCAAACAGAA CCCCGACATC
2341 GTCATCTATC AGTACATGGA TGACTTGTAC GTGGGCTCTG ATCTAGAGAT AGGGCAGCAC 
2401 CGCACCAAGA TCGAGGAGCT GCGCCAGCAC CTGTTGAGGT GGGGACTGAC CACACCCGAC 
2461 AAGAAGCACC AGAAGGAGGC TCCCTTCCTC TGGATGGGTT ACGAGCTGCA CCCTGACAAA
2521 TGGACCGTGC AGCCTATCGT GCTGCCAGAG AAAGACAGCT GGACTGTCAA CGACATACAG 
2581 AAGCTGGTGG GGAAGTTGAA CTGGGCCAGT CAGATTTACC CAGGGATTAA GGTGAGGCAG
2641 CTGTGCAAAC TCCTCCGCGG AACCAAGGCA CTCACAGAGG TGATCCCCCT AACCGAGGAG 
2701 GCCGAGCTCG AACTGGCAGA AAACCGAGAG ATCCTAAAGG AGCCCGTGCA CGGCGTGTAC
2761 TATGACCCCT CCAAGGACCT GATCGCCGAG ATCCAGAAGC AGGGGCAAGG CCAGTGGACC 
2821 TATCAGATTT ACCAGGAGCC CTTCAAGAAC CTGAAGACCG GCAAGTACGC CCGGATGAGG 
2881 GGTGCCCACA CTAACGACGT CAAGCAGCTG ACCGAGGCCG TGCAGAAGAT CACCACCGAA
2941 AGCATGGTGA TCTGGGGAAA GACTCCTAAG TTCAAGCTGC CCATCCAGAA GGAAACCTGG
3001 GAAACCTGGT GGACAGAGTA TTGGCAGGCC ACCTGGATTC CTGAGTGGGA GTTCGTCAAC
3061 ACCCCTCCCC TGGTGAAGCT GTGGTACCAG CTGGAGAAGG AGCCCATAGT GGGCGCCGAA
3121 ACCTTCTACG TGGATGGGGC CGCTAACAGG GAGACTAAGC TGGGCAAAGC CGGATACGTC 
3181 ACTAACCGGG GCAGACAGAA GGTTGTCACC CTCACTGACA CCACCAACCA GAAGACTGAG 
3241 CTGCAGGCCA TTTACCTCGC TTTGCAGGAC TCGGGCCTGG AGGTGAACAT CGTGACAGAC 

3301 TCTCAGTATG CCCTGGGCAT CATTCAAGCC CAGCCAGACC AGAGTGAGTC CGAGCTGGTC
3361 AATCAGATCA TCGAGCAGCT GATCAAGAAG GAAAAGGTCT ATCTGGGCTG GGTACCCGCC 
3421 CACAAAGGCA TTGGCGGCAA TGAGCAGGTC GACAAGCTGG TCTCGGCTGG CATCAGGAAG 

3481 GTGCTATTCC TGGATGGCAT CGACAAGGCC CAGGACGAGC ACGAGAAATA CCACAGCAAC
3541 TGGCGGGCCA TGGCTAGCGA CTTCAACCTG CCCCCTGTGG TGGCCAAAGA GATCGTGGCC
3601 AGCTGTGACA AGTGTCAGCT CAAGGGCGAA GCCATGCATG GCCAGGTGGA CTGTAGCCCC
3661 GGCATCTGGC AACTCGATTG CACCCATCTG GAGGGCAAGG TTATCCTGGT AGCCGTCCAT
3721 GTGGCCAGTG GCTACATCGA GGCCGAGGTC ATTCCCGCCG AAACAGGGCA GGAGACAGCC
3781 TACTTCCTCC TGAAGCTGGC AGGCCGGTGG CCAGTGAAGA CCATGCATAC TGACAATGGC
3841 AGCAATTTCA CCAGTGCTAC GGTTAAGGCC GCCTGCTGGT GGGCGGGAAT CAAGCAGGAG
3901 TTCGGGATCC CCTACAATCC CCAGAGTCAG GGCGTCGTCG AGTCTATGAA TAAGGAGTTA
3961 AAGAAGATTA TCGGCCAGGT CAGAGATCAG GCTGAGCATC TCAAGACCGC GGTCCAAATG
4021 GCGGTATTCA TCCACAATTT CAAGCGGAAG GGGGGGATTG GGGGGTACAG TGCGGGGGAG
4081 CGGATCGTGG ACATCATCGC GACCGACATC CAGACTAAGG AGCTGCAAAA GCAGATTACC
4141 AAGATTCAGA ATTTCCGGGT CTACTACAGG GACAGCAGAA ATCCCCTCTG GAAAGGCCCA 
4201 GCGAAGCTCC TCTGGAAGGG TGAGGGGGCA GTAGTGATCC AGGATAATAG CGACATCAAG
4261 GTGGTGCCCA GAAGAAAGGC GAAGATCATT AGGGATTATG GCAAACAGAT GGCGGGTGAT 
4321 GATTGCGTGG CGAGCAGACA GGATGAGGAT TAG 

SEQ. ID. NO. 14 - pSYNGP4 - codon optimised HIV-1 gagpol with 20 bp of the leader
sequence of HIV-1, upstream of the ATG start codon.

1 CGGAGGCTAG AAGGAGAGAG ATGGGCGCCC GCGCCAGCGT GCTGTCGGGC GGCGAGCTGG 
61 ACCGCTGGGA GAAGATCCGC CTGCGCCCCG GCGGCAAAAA GAAGTACAAG GTGAAGCACA 
121 TCGTGTGGGC CAGCCGCGAA CTGGAGCGCT TCGCCGTGAA CCCCGGGCTC CTGGAGACCA 
181 GCGAGGGGTG CCGCCAGATG CTCGGCCAAC TGCAGCCCAG CCTGCAAACC GGCAGCGAGG 

241 AGCTGCGCAG CCTGTACAAC ACCGTGGCCA CGCTGTACTG CGTCCACGAG CGCATCGAAA
301 TCAAGGATAC GAAAGAGGCC CTGGATAAAA TCGAAGAGGA ACAGAATAAG AGGAAAAAGA
361 AGGCCCAACA GGCCGCCGCG GACACCGGAC ACAGCAACCA GGTCAGCCAG AACTACCCCA 
421 TCGTGCAGAA CATCCAGGGG CAGATGGTGC ACCAGGCCAT CTCCCCCCGC ACGCTGAACG 
481 CCTGGGTGAA GGTGGTGGAA GAGAAGGCTT TTAGCCCGGA GGTGATACCC ATGTTCTCAG 
541 CCCTGTGAGA GGGAQCCACC CCCCAAGATC TGAACACCAT GCTCAACACA GTGGGGGGAC 
601 ACCAGGCCGC CATGCAGATG CTGAAGGAGA CCATCAATGA GGAGGCTGCC GAATGGGATC
661 GTGTGGATCC GGTGCACGCA GGGCCCATCG CACCGGGCCA GATGCGTGAG CCACGGGGCT
721 CAGACATCGC CGGAACGACT AGTACCCTTC AGGAACAGAT CGGCTGGATG ACCAACAACC 
781 CACCCATCCC GGTGGGAGAA ATCTACAAAC GCTGGATCAT CCTGGGCCTG AACAAGATCG 
841 TGCGCATGTA TAGCCCTACC AGCATCCTGG AGATCCGCCA AGGCCCGAAG GAACCCTTTC 
901 GCGACTACGT GGACCGGTTC TACAAAACGC TCCGCGCCGA GCAGGCTAGC CAGGAGGTGA 
961 AGAACTGGAT GACCGAAACC CTGCTGGTGC AGAACGCGAA CCCGGACTGC AAGACGATCC 

1021 TGAAGGCCCT GGGCCCAGCG GCTACCCTAG AGGAAATGAT GACCGCCTGT CAGGGAGTGG 
1081 GCGGACCCGG CCACAAGGCA CGCGTCCTGG CTGAGGCCAT GAGCCAGGTG ACGAACTCCG
1141 CTACCATCA^ GATGCAGCGC GGCAACTTTC GGAACCAACG CAAGATCGTC AAGTGCTTCA 
1201 ACTGTGGCAA AGAAGGGCAC ACAGCCCGCA ACTGCAGGGC GCCTAGGAAA AAGGGCTGTT 

1261 GGAAATGTGG AAAGGAAGGA CACCAAATGA AAGATTGTAC TGAGAGACAG GCTAATTTTT
1321 TAGGGAAGAT CTGGCCTTCC CACAAGGGAA GGCCAGGGAA TTTTCTTCAG AGCAGACCAG
1381 AGCCAACAGC CCCACCAGAA GAGAGCTTCA GGTTTGGGGA AGAGACAACA ACTCCCTCTC
1441 AGAAGCAGGA GCCGATAGAC AAGGAACTGT ATCCTTTAGC TTCCCTCAGA TCACTCTTTG
1501 GCAGCGACCC CTCGTCACAA TAAAGATAGG GGGGCAGCTC AAGGAGGCTC TCCTGGACAC
1561 CGGAGCAGAC GACACCGTGC TGGAGGAGAT GTCGTTGCCA GGCCGCTGGA AGCCGAAGAT 
1621 GATCGGGGGA ATCGGCGGTT TCATCAAGGT GCGCCAGTAT GACCAGATCC TCATCGAAAT 
1681 CTGCGGCCAC AAGGCTATCG GTACCGTGCT GGTGGGCCCC ACACCCGTCA ACATCATCGG
1741 AGGCAACCTG TTGACGCAGA TCGGTTGCAC GCTGAACTTC CCCATTAGCC CTATCGAGAC 
1801 GGTACCGGTG AAGCTGAAGC CCGGGATGGA CGGCCCGAAG GTCAAGCAAT GGCCATTGAC
1861 AGAGGAGAAG ATCAAGGCAC TGGTGGAGAT TTGCACAGAG ATGGAAAAGG AAGGGAAAAT 
1921 CTCCAAGATT GGGCCTGAGA ACCCGTACAA CACGCCGGTG TTCGCAATCA AGAAGAAGGA 
1981 CTCGACGAAA TGGCGCAAGC TGGTGGACTT CCGCGAGCTG AACAAGCGCA CGCAAGACTT 
2041 CTGGGAGGTT CAGCTGGGCA TCCCGCACCC CGCAGGGCTG AAGAAGAAGA AATCCGTGAC
2101 CGTACTGGAT GTGGGTGATG CCTACTTCTC CGTTCCCCTG GACGAAGACT TCAGGAAGTA 
2161 CACTGCCTTC ACAATCCCTT CGATCAACAA CGAGACACCG GGGATTCGAT ATCAGTACAA 
2221 CGTGCTGCCC CAGGGCTGGA AAGGCTCTCC CGCAATCTTC CAGAGTAGCA TGACCAAAAT 
2281 CCTGGAGCCT TTCCGCAAAC AGAACCCCGA CATCGTCATC TATCAGTACA TGGATGACTT
2341 GTACGTGGGC TCTGATCTAG AGATAGGGCA GCACCGCACC AAGATCGAGG AGCTGCGCCA 
2401 GCACCTGTTG AGGTGGGGAC TGACCACACC CGACAAGAAG CACCAGAAGG AGCCTCCCTT
2461 CCTCTGGATG GGTTACGAGC TGCACCCTGA CAAATGGACC GTGCAGCCTA TCGTGCTGCC 
2521 AGAGAAAGAC AGCTGGACTG TCAACGACAT ACAGAAGCTG GTGGGGAAGT TGAACTGGGC
2581 CAGTCAGATT TACCCAGGGA TTAAGGTGAG GCAGCTGTGC AAACTCCTCC GCGGAACCAA
2641 GGCACTCACA GAGGTGATCC CCCTAACCGA GGAGGCCGAG CTCGAACTGG CAGAAAACCG
2701 AGAGATCCTA AAGGAGCCCG TGCACGGCGT GTACTATGAC CCCTCCAAGG ACCTGATCGC 
2761 CGAGATCCAG AAGCAGGGGC AAGGCCAGTG GACCTATCAG ATTTACCAGG AGCCCTTCAA
2821 GAACCTGAAG ACCGGCAAGT ACGCCCGGAT GAGGGGTGCC CACACTAACG ACGTCAAGCA
2881 GCTGACCGAG GCCGTGCAGA AGATCACCAC CGAAAGCATC GTGATCTGGG GAAAGACTCC
2941 TAAGTTCAAG CTGCCCATCC AGAAGGAAAC CTGGGAAACC TGGTGGACAG AGTATTGGCA
3001 GGCCACCTGG ATTCCTGAGT GGGAGTTCGT CAACACCCCT CCCCTGGTGA AGCTGTGGTA
3061 CCAGCTGGAG AAGGAGCCCA TAGTGGGCGC CGAAACCTTC TACGTGGATG GGGCCGCTAA
3121 CAGGGAGACT AAGCTGGGCA AAGCCGGATA CGTCACTAAC CGGGGCAGAC AGAAGGTTGT 
3181 CACCCTCACT GACACCACCA ACCAGAAGAC TGAGCTGCAG GCCATTTACC TCGCTTTGCA 
3241 GGACTCGGGC CTGGAGGTGA ACATCGTGAC AGACTCTCAG TATGCCCTGG GCATCATTCA
3301 AGCCCAGCCA GACCAGAGTG AGTCCGAGCT GGTCAATCAG ATCATCGAGC AGCTGATCAA 
3361 GAAGGAAAAG GTCTATCTGG CCTGGGTACC CGCCCACAAA GGCATTGGCG GCAATGAGGA
3421 GGTCGACAAG CTGGTCTCGG CTGGCATCAG GAAGGTGCTA TTCCTGGATG GCATCGACAA 
3481 GGCCCAGGAC GAGCACGAGA AATACCACAG CAACTGGCGG GCCATGGCTA GCGACTTCAA 
3541 CCTGCCCCCT GTGGTGGCCA AAGAGATCGT GGCCAGCTGT GACAAGTGTC AGCTCAAGGG
3601 CGAAGCCATG CATGGCCAGG TGGACTGTAG CCCCGGCATC TGGCAACTCG ATTGCACCCA
3661 TCTGGAGGGC AAGGTTATCC TGGTAGCCGT CCATGTGGCC AGTGGCTACA TCGAGGCCGA
3721 GGTCATTCCC GCCGAAACAG GGCAGGAGAC AGCCTACTTC CTCCTGAAGC TGGCAGGCCG
3781 GTGGCCAGTG AAGACCATCC ATACTGACAA TGGCAGCAAT TTCACCAGTG CTACGGTTAA
3841 GGCCGCCTGC TGGTGGGCGG GAATCAAGCA GGAGTTCGGG ATCCCCTACA ATCCCCAGAG
3901 TCAGGGCGTC GTCGAGTCTA TGAATAAGGA GTTAAAGAAG ATTATCGGCC AGGTCAGAGA
3961 TCAGGCTGAG CATCTCAAGA CCGCGGTCCA AATGGCGGTA TTCATCCACA ATTTCAAGCG
4021 GAAGGGGGGG ATTGGGGGGT ACAGTGCGGG GGAGCGGATC GTGGACATCA TCGCGACCGA 
4081 CATCCAGACT AAGGAGCTGC AAAAGCAGAT TACCAAGATT CAGAATTTCC GGGTCTACTA 
4141 CAGGGACAGC AGAAATCCCC TCTGGAAAGG CCCAGCGAAG CTCCTCTGGA AGGGTGAGGG 
4201 GGCAGTAGTG ATCCAGGATA ATAGCGACAT CAAGGTGGTG CCCAGAAGAA AGGCGAAGAT 
4261 CATTAGGGAT TATGGCAAAC AGATGGCGGG TGATGATTGC GTGGCGAGCA GACAGGATGA 
4321 GGATTAG 
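
A reader who wants to work with the listings above computationally first has to strip the position indices and the spacing between blocks. The following Python sketch shows one possible way to do that. The function names `parse_listing` and `unexpected_symbols` are illustrative assumptions and are not part of the specification. As a check, the sketch confirms that the first listing line of SEQ. ID. NO. 14 carries 20 bases upstream of the first ATG.

```python
import re


def parse_listing(text: str) -> str:
    """Join a numbered sequence listing into one continuous string.

    Each line is assumed to be a position index followed by blocks of
    bases. The index is removed as a leading run of digits and spaces,
    which also tolerates an index that was split by a stray space.
    """
    parts = []
    for line in text.splitlines():
        body = re.sub(r"^[\d\s]+", "", line)  # drop the position index
        parts.append("".join(body.split()))   # drop spaces between blocks
    return "".join(parts)


def unexpected_symbols(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based position, symbol) for anything other than A, C, G, T."""
    return [(i + 1, s) for i, s in enumerate(sequence) if s not in "ACGT"]


# First listing line of SEQ. ID. NO. 14, copied from the listing above.
first_line = "1 CGGAGGCTAG AAGGAGAGAG ATGGGCGCCC GCGCCAGCGT GCTGTCGGGC GGCGAGCTGG"
sequence = parse_listing(first_line)

leader_length = sequence.find("ATG")  # number of bases before the first ATG
print(len(sequence), leader_length, unexpected_symbols(sequence))
```

Running `unexpected_symbols` over a whole parsed listing is a quick way to locate symbols that are not bases, which may indicate damage introduced when the listing was reproduced.
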
ANTI-VIRAL VECTORS



Field of the Invention 

The present invention relates to novel viral vectors capable of delivering anti-viral 
inhibitory RNA molecules to target cells.

Background to the Invention 

The application of gene therapy to the treatment of AIDS and HIV infection has been 
discussed widely (Lever, 1995). The types of therapeutic gene proposed usually fall into 
one of two broad categories. In the first the gene encodes protein products that inhibit the 
virus in a number of possible ways. One example of such a protein is the RevM10
derivative of the HIV Rev protein. The RevM10 protein acts as a transdominant negative
mutant and so competitively inhibits Rev function in the virus. Like many of the
protein-based strategies, the RevM10 protein is a derivative of a native HIV protein. While this
provides the basis for the anti-HIV effect, it also has serious disadvantages. In particular,
this type of strategy demands that in the absence of the virus there is little or no expression 
of the gene. Otherwise, healthy cells harbouring the gene become a target for the host 
cytotoxic T lymphocyte (CTL) system, which recognises the foreign protein. The second 
broad category of therapeutic gene circumvents these CTL problems. The therapeutic gene 
encodes inhibitory RNA molecules; RNA is not a target for CTL recognition. 

There are several types of inhibitory RNA molecules known: anti-sense RNA, ribozymes, 
competitive decoys and external guide sequences (EGSs). 

External guide sequences, first identified by Forster and Altman (1990), are RNA 
sequences that are capable of directing the cellular protein RNase P to cleave a particular 
RNA sequence. In vivo, they are found as part of precursor tRNAs where they function to
direct cleavage of the tRNA precursor by the cellular riboprotein RNase P to form
mature tRNA. However, in principle, any RNA can be targeted by a custom-designed EGS
RNA for specific cleavage by RNase P in vitro or in vivo. For example, Yuan et al. (1992)
demonstrate a reduction in the levels of chloramphenicol activity in cells in tissue culture
as a result of introducing an appropriately designed EGS.

In recent years a number of laboratories have developed retroviral vector systems based on
HIV. In the context of anti-HIV gene therapy these vectors have a number of advantages
over the more conventional murine-based vectors such as murine leukaemia virus (MLV)
vectors. Firstly, HIV vectors would target precisely those cells that are susceptible to HIV
infection. Secondly, the HIV-based vector would transduce cells such as macrophages that
are normally refractory to transduction by murine vectors. Thirdly, the anti-HIV vector
genome would be propagated through the CD4+ cell population by any virus (HIV) that
escaped the therapeutic strategy. This is because the vector genome has the packaging
signal that will be recognised by the viral particle packaging system. These various
attributes make HIV vectors a powerful tool in the field of anti-HIV gene therapy.

A combination of inhibitory RNA molecules and an HIV-based vector would be attractive
as a therapeutic strategy. However, until now this has not been possible. Vector particle
production takes place in producer cells which express the packaging components of the
particles and package the vector genome. The inhibitory RNA sequences that are designed
to destroy the viral RNA would therefore also interrupt the expression of the components
of the HIV-based vector system during vector production. The present invention aims to
overcome this problem.

Summary of the Invention

It is therefore an object of the invention to provide a system and method for producing viral
particles, in particular HIV particles, which carry nucleotide constructs encoding inhibitory
RNA molecules such as external guide sequences, optionally together with other classes of
inhibitory RNA molecules such as ribozymes and/or antisense RNAs directed against a
corresponding virus, such as HIV, within a target cell, that overcomes the above-mentioned
problems. The system includes both a viral genome encoding the inhibitory RNA molecules
and nucleotide constructs encoding the components required for packaging the viral
genome in a producer cell. However, in contrast to the prior art, although the packaging
components have substantially the same amino acid sequence as the corresponding
components of the target virus, the inhibitory RNA molecules do not affect production of
the viral particles in the producer cells because the nucleotide sequence of the packaging
components used in the viral system has been modified to prevent the inhibitory RNA
molecules from effecting cleavage or degradation of the RNA transcripts produced from
the constructs. Such a viral particle may be used to treat viral infections, in particular HIV
infections.

Accordingly the present invention provides a viral vector system comprising: 

(i) a first nucleotide sequence encoding an external guide sequence capable of binding
to and effecting the cleavage by RNase P of a second nucleotide sequence, or transcription
product thereof, encoding a viral polypeptide required for the assembly of viral particles;
and

(ii) a third nucleotide sequence encoding said viral polypeptide required for the
assembly of viral particles, which third nucleotide sequence has a different nucleotide
sequence to the second nucleotide sequence such that the third nucleotide sequence, or
transcription product thereof, is resistant to cleavage directed by the external guide
sequence.

Preferably, said system further comprises at least one further first nucleotide sequence
encoding a gene product capable of binding to and effecting the cleavage, directly or
indirectly, of a second nucleotide sequence, or transcription product thereof, encoding a
viral polypeptide required for the assembly of viral particles, wherein the gene product is
selected from an external guide sequence, a ribozyme and an anti-sense ribonucleic acid.

In another aspect, the present invention provides a viral vector production system
comprising:

(i) a viral genome comprising at least one first nucleotide sequence encoding a gene
product capable of binding to and effecting the cleavage, directly or indirectly, of a second
nucleotide sequence, or transcription product thereof, encoding a viral polypeptide required
for the assembly of viral particles;

(ii) a third nucleotide sequence encoding said viral polypeptide required for the
assembly of the viral genome into viral particles, which third nucleotide sequence has a
different nucleotide sequence to the second nucleotide sequence such that said third
nucleotide sequence, or transcription product thereof, is resistant to cleavage directed by
said gene product;

wherein at least one of the gene products is an external guide sequence capable of binding
to and effecting the cleavage by RNase P of the second nucleotide sequence.

Preferably, in addition to an external guide sequence, at least one gene product is selected 
from a ribozyme and an anti-sense ribonucleic acid, preferably a ribozyme. 

Preferably, the viral vector is a retroviral vector, more preferably a lentiviral vector, such as
an HIV vector. The second nucleotide sequence and the third nucleotide sequence are
typically from the same viral species, more preferably from the same viral strain.
Generally, the viral genome is also from the same viral species, more preferably from the
same viral strain.

In the case of retroviral vectors, the polypeptide required for the assembly of viral particles
is selected from gag, pol and env proteins. Preferably at least the gag and pol sequences
are lentiviral sequences, more preferably HIV sequences. Alternatively, or in addition, the
env sequence is a lentiviral sequence, more preferably an HIV sequence.

In a preferred embodiment, the third nucleotide sequence is resistant to cleavage directed
by the gene product as a result of one or more conservative alterations in the nucleotide
sequence which remove cleavage sites recognised by the at least one gene product and/or
binding sites for the at least one gene product. For example, where the gene product is an
EGS, the third nucleotide sequence is adapted to prevent EGS binding and/or to remove the
RNase P consensus cleavage site. Alternatively, where the gene product is a ribozyme, the
third nucleotide sequence is adapted to be resistant to cleavage by the ribozyme.
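
The idea of altering the nucleotide sequence without altering the encoded polypeptide can be checked computationally. The Python sketch below is illustrative only. `optimised` is the first 42 coding bases of SEQ. ID. NO. 14 as listed earlier, whereas `unmodified` is a hypothetical alternative encoding of the same fourteen amino acids written for this example and is not taken from the specification. The helper names are likewise assumptions. The sketch confirms that both sequences translate to the same peptide and measures how short the longest shared stretch of bases is.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence, reading complete codons from position 0."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def count_differences(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def longest_identical_run(a: str, b: str) -> int:
    """Length of the longest stretch of consecutive identical positions."""
    best = current = 0
    for x, y in zip(a, b):
        current = current + 1 if x == y else 0
        best = max(best, current)
    return best


# Hypothetical non-optimised encoding (an assumption made for this example).
unmodified = "ATGGGTGCGAGAGCGTCAGTATTAAGCGGGGGAGAATTAGAT"
# First 42 coding bases of SEQ. ID. NO. 14, starting at the ATG.
optimised = "ATGGGCGCCCGCGCCAGCGTGCTGTCGGGCGGCGAGCTGGAC"

assert translate(unmodified) == translate(optimised)  # same polypeptide
print(translate(optimised))
print(count_differences(unmodified, optimised))
print(longest_identical_run(unmodified, optimised))
```

A short longest shared run means that a guide or ribozyme relying on an extended stretch of complementarity to the unmodified sequence would find no equivalent site in the recoded one. The sketch does not model the structural requirements of RNase P recognition, which a real design would also have to consider.
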

Preferably the third nucleotide sequence is codon optimised for expression in host cells.
The host cells, which term includes producer cells and packaging cells, are typically
mammalian cells.



In a particularly preferred embodiment, (i) the viral genome is an HIV genome comprising
nucleotide sequences encoding anti-HIV EGSs and optionally anti-HIV ribozyme
sequences directed against HIV packaging component sequences (such as gag.pol) in a
target HIV and (ii) the viral system for producing packaged HIV particles further comprises
nucleotide constructs encoding the same packaging components (such as gag.pol proteins)
as in the target HIV wherein the sequence of the nucleotide constructs is different from that
found in the target HIV so that the anti-HIV EGS and anti-HIV ribozyme sequences cannot
effect cleavage or degradation of the gag.pol transcripts during production of the HIV
particles in producer cells.

The present invention also provides a viral particle comprising a viral vector according to
the present invention and one or more polypeptides encoded by the third nucleotide
sequences according to the present invention. For example the present invention provides
a viral particle produced using the viral vector production system of the invention.

In another aspect, the present invention provides a method for producing a viral particle
which method comprises introducing into a host cell (i) a viral genome vector according to
the present invention; (ii) one or more third nucleotide sequences according to the present
invention; and (iii) nucleotide sequences encoding the other essential viral packaging
components not encoded by the one or more third nucleotide sequences.

The present invention further provides a viral particle produced by the method of the
invention.

The present invention also provides a pharmaceutical composition comprising a viral
particle according to the present invention together with a pharmaceutically acceptable
carrier or diluent.

The viral system of the invention or viral particles of the invention may be used to treat
viral infections, particularly retroviral infections such as lentiviral infections including HIV
infections. Thus the present invention provides a method of treating a viral infection which
method comprises administering to a human or animal patient suffering from the viral
infection an effective amount of a viral system, viral particle or pharmaceutical
composition of the present invention.

The invention relates in particular to HIV-based vectors carrying anti-HIV EGSs.
However, the invention can be applied to any other virus, in particular any other lentivirus,
for which treatment by gene therapy may be desirable. The invention is illustrated herein
for HIV, but this is not considered to limit the scope of the invention to HIV-based
anti-HIV vectors.

Detailed Description of the Invention 

The term "viral vector" refers to a nucleotide construct comprising a viral genome capable
of being transcribed in a host cell, which genome comprises sufficient viral genetic
information to allow packaging of the viral RNA genome, in the presence of packaging
components, into a viral particle capable of infecting a target cell. Infection of the target
cell includes reverse transcription and integration into the target cell genome, where
appropriate for particular viruses. The viral vector in use typically carries heterologous
coding sequences (nucleotides of interest) which are to be delivered by the vector to the
target cell, for example a first nucleotide sequence encoding an EGS. A viral vector is
incapable of independent replication to produce infectious viral particles within the final
target cell.

The term "viral vector system" is intended to mean a kit of parts which can be used when
combined with other necessary components for viral particle production to produce viral
particles in host cells. For example, the first nucleotide sequence may typically be present
in a plasmid vector construct suitable for cloning the first nucleotide sequence into a viral
genome vector construct. When combined in a kit with a third nucleotide sequence, which
will also typically be present in a separate plasmid vector construct, the resulting
combination of plasmid containing the first nucleotide sequence and plasmid containing
the third nucleotide sequence comprises the essential elements of the invention. Such a kit
may then be used by the skilled person in the production of suitable viral vector genome
constructs which, when transfected into a host cell together with the plasmid containing the
third nucleotide sequence, and optionally nucleic acid constructs encoding other
components required for viral assembly, will lead to the production of infectious viral
particles.
Alternatively, the third nucleotide sequence may be stably present within a packaging cell
line that is included in the kit.



The kit may include the other components needed to produce viral particles, such as host
cells and other plasmids encoding essential viral polypeptides required for viral assembly.
By way of example, the kit may contain (i) a plasmid containing a first nucleotide sequence
encoding an anti-HIV EGS and (ii) a plasmid containing a third nucleotide sequence
encoding a modified HIV gag.pol construct which cannot be cleaved by the anti-HIV
ribozyme. Optional components would then be (a) an HIV viral genome construct with
suitable restriction enzyme recognition sites for cloning the first nucleotide sequence into
the viral genome; (b) a plasmid encoding a VSV-G env protein. Alternatively, nucleotide
sequences encoding viral polypeptides required for assembly of viral particles may be
provided in the kit as packaging cell lines comprising the nucleotide sequences, for
example a VSV-G expressing cell line.

The term "viral vector production system" refers to the viral vector system described above
wherein the first nucleotide sequence has already been inserted into a suitable viral vector
genome.

Viral vectors are typically retroviral vectors, in particular lentiviral vectors such as HIV
vectors. The retroviral vector of the present invention may be derived from or may be
derivable from any suitable retrovirus. A large number of different retroviruses have been
identified. Examples include: murine leukemia virus (MLV), human immunodeficiency
virus (HIV), simian immunodeficiency virus, human T-cell leukemia virus (HTLV),
equine infectious anaemia virus (EIAV), mouse mammary tumour virus (MMTV), Rous
sarcoma virus (RSV), Fujinami sarcoma virus (FuSV), Moloney murine leukemia virus
(Mo-MLV), FBR murine osteosarcoma virus (FBR MSV), Moloney murine sarcoma virus
(Mo-MSV), Abelson murine leukemia virus (A-MLV), Avian myelocytomatosis virus-29
(MC29), and Avian erythroblastosis virus (AEV). A detailed list of retroviruses may be
found in Coffin et al., 1997, "Retroviruses", Cold Spring Harbor Laboratory Press, Eds:
JM Coffin, SM Hughes, HE Varmus, pp 758-763.

Details on the genomic structure of some retroviruses may be found in the art. By way of
example, details on HIV and Mo-MLV may be found from the NCBI Genbank (Genome
Accession Nos. AF033819 and AF033811, respectively).



The lentivirus group can be split even further into "primate" and "non-primate". Examples
of primate lentiviruses include human immunodeficiency virus (HIV), the causative agent
of acquired immunodeficiency syndrome (AIDS), and simian immunodeficiency virus
(SIV). The non-primate lentiviral group includes the prototype "slow virus" visna/maedi
virus (VMV), as well as the related caprine arthritis-encephalitis virus (CAEV), equine
infectious anaemia virus (EIAV) and the more recently described feline immunodeficiency
virus (FIV) and bovine immunodeficiency virus (BIV).

The basic structure of a retrovirus genome is a 5' LTR and a 3' LTR, between or within
which are located a packaging signal to enable the genome to be packaged, a primer
binding site, integration sites to enable integration into a host cell genome and gag, pol and
env genes encoding the packaging components - these are polypeptides required for the
assembly of viral particles. More complex retroviruses have additional features, such as
rev and RRE sequences in HIV, which enable the efficient export of RNA transcripts of the
integrated provirus from the nucleus to the cytoplasm of an infected target cell.

In the provirus, these genes are flanked at both ends by regions called long terminal repeats
(LTRs). The LTRs are responsible for proviral integration and transcription. LTRs also
serve as enhancer-promoter sequences and can control the expression of the viral genes.
Encapsidation of the retroviral RNAs occurs by virtue of a psi sequence located at the 5'
end of the viral genome.

The LTRs themselves are identical sequences that can be divided into three elements,
which are called U3, R and U5. U3 is derived from the sequence unique to the 3' end of
the RNA. R is derived from a sequence repeated at both ends of the RNA and U5 is
derived from the sequence unique to the 5' end of the RNA. The sizes of the three
elements can vary considerably among different retroviruses.

In a defective retroviral vector genome gag, pol and env may be absent or not functional.
The R regions at both ends of the RNA are repeated sequences. U5 and U3 represent
unique sequences at the 5' and 3' ends of the RNA genome respectively.
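
The relationship between the ends of the RNA genome and the LTRs of the provirus can be made concrete with a small model. The Python sketch below is an illustration of the arrangement described in this section (RNA genome: R and U5 at the 5' end, U3 and R at the 3' end; each proviral LTR: U3, R, U5). The class and field names are hypothetical, and the element contents are placeholder labels rather than real sequences.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenomicRNA:
    """Retroviral RNA genome: R and U5 at the 5' end, U3 and R at the 3' end."""
    r: str
    u5: str
    internal: str
    u3: str

    def layout(self) -> list[str]:
        """Order of elements along the RNA, 5' to 3'."""
        return [self.r, self.u5, self.internal, self.u3, self.r]


def provirus_layout(rna: GenomicRNA) -> list[str]:
    """Provirus: the internal region flanked by two identical U3-R-U5 LTRs."""
    ltr = [rna.u3, rna.r, rna.u5]
    return ltr + [rna.internal] + ltr


# Placeholder labels only; no real sequence data is used here.
rna = GenomicRNA(r="R", u5="U5", internal="psi-gag-pol-env", u3="U3")
print(" - ".join(rna.layout()))
print(" - ".join(provirus_layout(rna)))
```

In a defective vector genome of the kind described here, the `internal` label would stand for the packaging signal and the nucleotide sequence of interest rather than functional gag, pol and env genes.
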



In a typical retroviral vector for use in gene therapy, at least part of one or more of the gag,
pol and env protein coding regions essential for replication may be removed from the virus.
This makes the retroviral vector replication-defective. The removed portions may even be
replaced by a nucleotide sequence of interest (NOI), such as a first nucleotide sequence of
the invention, to generate a virus capable of integrating its genome into a host genome but
wherein the modified viral genome is unable to propagate itself due to a lack of structural
proteins. When integrated in the host genome, expression of the NOI occurs - resulting in,
for example, a therapeutic and/or a diagnostic effect. Thus, the transfer of an NOI into a
site of interest is typically achieved by: integrating the NOI into the recombinant viral
vector; packaging the modified viral vector into a virion coat; and allowing transduction of
a site of interest - such as a targeted cell or a targeted cell population.

A minimal retroviral genome for use in the present invention will therefore comprise (5') R
- U5 - one or more first nucleotide sequences - U3 - R (3'). However, the plasmid vector
used to produce the retroviral genome within a host cell/packaging cell will also include
transcriptional regulatory control sequences operably linked to the retroviral genome to
direct transcription of the genome in a host cell/packaging cell. These regulatory
sequences may be the natural sequences associated with the transcribed retroviral sequence,
i.e. the 5' U3 region, or they may be a heterologous promoter such as another viral
promoter, for example the CMV promoter.

Some retroviral genomes require additional sequences for efficient virus production. For
example, in the case of HIV, rev and RRE sequences are preferably included. However, the
requirement for rev and RRE can be reduced or eliminated by codon optimisation.

Once the retroviral vector genome is integrated into the genome of its target cell as proviral
DNA, the ribozyme sequences need to be expressed. In a retrovirus, the promoter is
located in the 5' LTR U3 region of the provirus. In retroviral vectors, the promoter driving
expression of a therapeutic gene may be the native retroviral promoter in the 5' U3 region,
or an alternative promoter engineered into the vector. The alternative promoter may
physically replace the 5' U3 promoter native to the retrovirus, or it may be incorporated at
a different place within the vector genome such as between the LTRs.

Thus, the first nucleotide sequence will also be operably linked to a transcriptional
regulatory control sequence to allow transcription of the first nucleotide sequence to occur
in the target cell. The control sequence will typically be active in mammalian cells. The
control sequence may, for example, be a viral promoter such as the natural viral promoter
or a CMV promoter or it may be a mammalian promoter. It is particularly preferred to use
a promoter that is preferentially active in a particular cell type or tissue type which the
virus to be treated primarily infects. Thus, in one embodiment, a tissue-specific regulatory
sequence may be used. The regulatory control sequences driving expression of the one or
more first nucleotide sequences may be constitutive or regulated promoters.



Replication-defective retroviral vectors are typically propagated, for example to prepare
suitable titres of the retroviral vector for subsequent transduction, by using a combination
of a packaging or helper cell line and the recombinant vector. That is to say, the three
packaging proteins can be provided in trans.

A "packaging cell line" contains one or more of the retroviral gag, pol and env genes. The
packaging cell line produces the proteins required for packaging retroviral DNA but it
cannot bring about encapsidation due to the lack of a psi region. However, when a
recombinant vector carrying an NOI and a psi region is introduced into the packaging cell
line, the helper proteins can package the psi-positive recombinant vector to produce the
recombinant virus stock. This virus stock can be used to transduce cells to introduce the
NOI into the genome of the target cells. It is preferred to use a psi packaging signal, called
psi plus, that contains additional sequences spanning from upstream of the splice donor to
downstream of the gag start codon (Bender et al., 1987) since this has been shown to
increase viral titres.

The recombinant virus whose genome lacks all genes required to make viral proteins can
transduce only once and cannot propagate. These viral vectors which are only capable of a
single round of transduction of target cells are known as replication defective vectors.
Hence, the NOI is introduced into the host/target cell genome without the generation of
potentially harmful retrovirus. A summary of the available packaging lines is presented in
Coffin et al., 1997 (ibid.).

Retroviral packaging cell lines in which the gag, pol and env viral coding regions are
carried on separate expression plasmids that are independently transfected into a packaging
cell line are preferably used. This strategy, sometimes referred to as the three plasmid
transfection method (Soneoka et al., 1995), reduces the potential for production of a
replication-competent virus since three recombinant events are required for wild type viral
production. As recombination is greatly facilitated by homology, reducing or eliminating
homology between the genomes of the vector and the helper can also be used to reduce the
problem of replication-competent helper virus production.

An alternative to stably transfected packaging cell lines is to use transiently transfected cell
lines. Transient transfections may advantageously be used to measure levels of vector
production when vectors are being developed. In this regard, transient transfection avoids
the longer time required to generate stable vector-producing cell lines and may also be used
if the vector or retroviral packaging components are toxic to cells. Components typically
used to generate retroviral vectors include a plasmid encoding the gag/pol proteins, a
plasmid encoding the env protein and a plasmid containing an NOI. Vector production
involves transient transfection of one or more of these components into cells containing the
other required components. If the vector encodes toxic genes or genes that interfere with
the replication of the host cell, such as inhibitors of the cell cycle or genes that induce
apoptosis, it may be difficult to generate stable vector-producing cell lines, but transient
transfection can be used to produce the vector before the cells die. Also, cell lines have
been developed using transient transfection that produce vector titre levels that are
comparable to the levels obtained from stable vector-producing cell lines (Pear et al.,
1993).






Producer cells/packaging cells can be of any suitable cell type. Most commonly, 
mammalian producer cells are used but other cells, such as insect cells are not excluded. 
Clearly, the producer cells will need to be capable of efficiently translating the env and gag, 
pol mRNA. Many suitable producer/packaging cell lines are known in the art. The skilled 
person is also capable of making suitable packaging cell lines by, for example, stably
introducing a nucleotide construct encoding a packaging component into a cell line. 



As will be discussed below, where the retroviral genome encodes an inhibitory RNA
molecule capable of effecting the cleavage of gag, pol and/or env RNA transcripts, the
nucleotide sequences present in the packaging cell line, either integrated or carried on
plasmids, or in the transiently transfected producer cell line, which encode gag, pol and/or
env proteins will be modified so as to reduce or prevent binding of the inhibitory RNA
molecule(s). In this way, the inhibitory RNA molecule(s) will not prevent expression of
components in packaging cell lines that are essential for packaging of viral particles.

It is highly desirable to use high-titre virus preparations in both experimental and practical
applications. Techniques for increasing viral titre include using a psi plus packaging signal
as discussed above and concentration of viral stocks. In addition, the use of different
envelope proteins, such as the G protein from vesicular stomatitis virus, has improved titres
following concentration to 10^ per ml (Cosset et al., 1995). However, typically the
envelope protein will be chosen such that the viral particle will preferentially infect cells
that are infected with the virus which it is desired to treat. For example, where an HIV vector
is being used to treat HIV infection, the env protein used will be the HIV env protein.

Suitable first nucleotide sequences for use according to the present invention encode gene
products that result in the cleavage and/or enzymatic degradation of a target nucleotide
sequence, which will generally be a ribonucleotide. As particular examples, EGSs,
ribozymes, and antisense sequences may be mentioned, more specifically EGSs.

External guide sequences (EGSs) are RNA sequences that bind to a complementary target
sequence to form a loop in the target RNA sequence, the overall structure being a substrate
for RNaseP-mediated cleavage of the target RNA sequence. The structure that forms when
the EGS anneals to the target RNA is very similar to that found in a tRNA precursor. Thus
the natural activity of RNaseP can be directed to cleave a target RNA by designing a
suitable EGS. The general rules for EGS design are as follows, with reference to the
generic EGSs shown in Figure 9B:
Rules for EGS design in mammalian cells (see Figure 9B)

Target sequence - All tRNA precursor molecules have a G immediately 3' of the RNaseP
cleavage site (i.e. the G forms a base pair with the C at the top of the acceptor stem prior to
the ACCA sequence). In addition a U is found 8 nucleotides downstream in all tRNAs
(i.e. G at position 1, U at position 8). A pyrimidine may be preferred 5' of the cut site. No
other specific target sequences are required.

EGS sequence - A 7 nucleotide 'acceptor stem' analogue is optimal (5' hybridising arm). 
A 4 nucleotide 'D-stem' analogue is preferred (3' hybridising arm). Variation in this 
length may alter the reaction kinetics. This will be specific to each target site. A consensus 
'T-stem and loop' analogue is essential. Minimal 5' and 3' non-pairing sequences are 
preferred to reduce the potential for undesired folding of the EGS RNA. 

Deletion of the 'anti-codon stem and loop' analogue may be beneficial. Deletion of the 
variable loop can also be tolerated in vitro but an optimal replacement loop for the deletion 
of both has not been defined in vivo. 
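
The target-sequence rule above can be illustrated with a simple scan. The following Python sketch is not part of the original disclosure: the function name and the example sequence are hypothetical, and it assumes that "position 8" is counted with the G immediately 3' of the cleavage site as position 1.

```python
def find_egs_target_sites(rna):
    """Return candidate RNaseP cleavage positions in an RNA string.

    Index i is reported when rna[i] is G (position 1, immediately 3' of
    the cut) and rna[i + 7] is U (position 8). Each hit is a tuple
    (i, pyrimidine_upstream); the flag records whether the base 5' of the
    cut site is C or U, which the rules describe as possibly preferred.
    """
    rna = rna.upper().replace("T", "U")
    hits = []
    for i in range(1, len(rna) - 7):
        if rna[i] == "G" and rna[i + 7] == "U":
            hits.append((i, rna[i - 1] in "CU"))
    return hits


# Hypothetical sequence, built piecewise so the positions are easy to count
example = "ACG" + "A" * 6 + "UCC" + "AG" + "C" * 6 + "UA"
sites = find_egs_target_sites(example)
```

Such a scan only narrows down candidate sites; the EGS arm lengths and the T-stem and loop analogue would still need to be chosen for each target as described above.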

As with ribozymes, described below, it is preferred to use more than one EGS. Preferably, 
a plurality of EGSs is employed, together capable of cleaving gag, pol and env RNA of the 
native retrovirus at a plurality of sites. Since HIV exists as a population of quasispecies, 
not all of the target sequences for the EGSs will be included in all HIV variants. The 
problem presented by this variability can be overcome by using multiple EGSs. Multiple
EGSs can be included in series in a single vector and can function independently when 
expressed as a single RNA sequence. A single RNA containing two or more EGSs having 
different target recognition sites may be referred to as a multitarget EGS. 

Further guidance may be obtained by reference to, for example, Werner et al. (1997);
Werner et al. (1998); Ma et al. (1998) and Kawa et al. (1998).

Ribozymes are RNA enzymes which cleave RNA at specific sites. Ribozymes can be 
engineered so as to be specific for any chosen sequence containing a ribozyme cleavage 
site. Thus, ribozymes can be engineered which have chosen recognition sites in transcribed 
viral sequences. By way of an example, ribozymes encoded by the first nucleotide
sequence recognise and cleave essential elements of viral genomes required for the
production of viral particles, such as packaging components. Thus, for retroviral genomes,
such essential elements include the gag, pol and env gene products. A suitable ribozyme
capable of recognising at least one of the gag, pol and env gene sequences, or more
typically, the RNA sequences transcribed from these genes, is able to bind to and cleave
such a sequence. This will reduce or prevent production of the gag, pol or env protein as
appropriate and thus reduce or prevent the production of retroviral particles.

Ribozymes come in several forms, including hammerhead, hairpin and hepatitis delta
antigenomic ribozymes. Preferred for use herein are hammerhead ribozymes, in part 
because of their relatively small size, because the sequence requirements for their target 
cleavage site are minimal and because they have been well characterised. The ribozymes 
most commonly used in research at present are hammerhead and hairpin ribozymes. 

Each individual ribozyme has a motif which recognises and binds to a recognition site in
the target RNA. This motif takes the form of one or more "binding arms", generally two
binding arms. The binding arms in hammerhead ribozymes are the flanking sequences
Helix I and Helix III, which flank Helix II. These can be of variable length, usually
between 6 and 10 nucleotides each, but can be shorter or longer. The length of the flanking
sequences can affect the rate of cleavage. For example, it has been found that reducing the
total number of nucleotides in the flanking sequences from 20 to 12 can increase the
turnover rate of the ribozyme cleaving an HIV sequence by 10-fold (Goodchild et al.,
1991). A catalytic motif in the ribozyme Helix II in hammerhead ribozymes cleaves the
target RNA at a site which is referred to as the cleavage site. Whether or not a ribozyme
will cleave any given RNA is determined by the presence or absence of a recognition site
for the ribozyme containing an appropriate cleavage site.

Each type of ribozyme recognises its own cleavage site. The hammerhead ribozyme
cleavage site has the nucleotide base triplet GUX directly upstream where G is guanine, U
is uracil and X is any nucleotide base. Hairpin ribozymes have a cleavage site of
BCUGNYR, where B is any nucleotide base other than adenine, N is any nucleotide, Y is
cytosine or thymine and R is guanine or adenine. Cleavage by hairpin ribozymes takes
place between the G and the N in the cleavage site.
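
The two cleavage-site motifs just described can be located in a candidate target RNA with a short pattern search. The sketch below is illustrative only and is not taken from the original disclosure: the function names are hypothetical, and Y is written as C or U on the assumption that the target is RNA.

```python
import re

# Hammerhead: cleavage occurs directly 3' of a GUX triplet (X = any base).
HAMMERHEAD = re.compile(r"(?=GU[ACGU])")
# Hairpin: BCUGNYR with B = any base except A, N = any base, R = G or A.
# Assumption: Y is taken as C or U because the pattern is applied to RNA.
HAIRPIN = re.compile(r"(?=[CGU]CUG[ACGU][CU][AG])")


def hammerhead_sites(rna):
    """Return 0-based indices of the G of every GUX triplet."""
    return [m.start() for m in HAMMERHEAD.finditer(rna.upper())]


def hairpin_cut_positions(rna):
    """Return 0-based indices of the N base of every BCUGNYR motif.

    The cut falls immediately 5' of this base, i.e. between G and N.
    """
    return [m.start() + 4 for m in HAIRPIN.finditer(rna.upper())]
```

The lookahead form `(?=...)` is used so that overlapping motifs are all reported. A hit only indicates a potential cleavage site; whether a given ribozyme cleaves there also depends on its binding arms matching the flanking sequence.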

The nucleic acid sequences encoding the packaging components (the "third nucleotide 
sequences") may be resistant to the ribozyme or ribozymes because they lack any cleavage 
sites for the ribozyme or ribozymes. This prohibits enzymatic activity by the ribozyme or 
ribozymes and therefore there is no effective recognition site for the ribozyme or 
ribozymes. Alternatively or additionally, the potential recognition sites may be altered in 
the flanking sequences which form the part of the recognition site to which the ribozyme 
binds. This either eliminates binding of the ribozyme motif to the recognition site, or 
reduces binding capability enough to destabilise any ribozyme-target complex and thus 
reduce the specificity and catalytic activity of the ribozyme. Where the flanking sequences 
only are altered, they are preferably altered such that catalytic activity of the ribozyme at 
the altered target sequence is negligible and is effectively eliminated. 

Preferably, a series of several anti-HIV ribozymes is employed in the invention. These can 
be any anti-HIV ribozymes but must include one or more which cleave the RNA that is 
required for the expression of gag, pol or env. Preferably, a plurality of ribozymes is 
employed, together capable of cleaving gag, pol and env RNA of the native retrovirus at a 
plurality of sites. Since HIV exists as a population of quasispecies, not all of the target 
sequences for the ribozymes will be included in all HIV variants. The problem presented 
by this variability can be overcome by using multiple ribozymes. Multiple ribozymes can
be included in series in a single vector and can function independently when expressed as a 
single RNA sequence. A single RNA containing two or more ribozymes having different 
target recognition sites may be referred to as a multitarget ribozyme. The placement of 
ribozymes in series has been demonstrated to enhance cleavage. The use of a plurality of 
ribozymes is not limited to treating HIV infection but may be used in relation to other 
viruses, retroviruses or otherwise. 

Antisense technology is well known in the art. There are various mechanisms by which
antisense sequences are believed to inhibit gene expression. One mechanism by which
antisense sequences are believed to function is the recruitment of the cellular protein
RNaseH to the target sequence/antisense construct heteroduplex which results in cleavage
and degradation of the heteroduplex. Thus the antisense construct, by contrast to
ribozymes, can be said to lead indirectly to cleavage/degradation of the target sequence.
Thus according to the present invention, a first nucleotide sequence may encode an
antisense RNA that binds to either a gene encoding an essential/packaging component or
the RNA transcribed from said gene such that expression of the gene is inhibited, for
example as a result of RNaseH degradation of a resulting heteroduplex. It is not necessary
for the antisense construct to encode the entire complementary sequence of the gene
encoding an essential/packaging component - a portion may suffice. The skilled person
will easily be able to determine how to design a suitable antisense construct.

By contrast, the nucleic acid sequences encoding the essential/packaging components of
the viral particles required for the assembly of viral particles in the host cells/producer
cells/packaging cells (the third nucleotide sequences) are resistant to the inhibitory RNA
molecules encoded by the first nucleotide sequence. For example in the case of ribozymes,
resistance is typically by virtue of alterations in the sequences which eliminate the
ribozyme recognition sites. At the same time, the amino acid coding sequence for the 
essential/packaging components is retained so that the viral components encoded by the 
sequences remain the same, or at least sufficiently similar that the function of the 
essential/packaging components is not compromised. 

The term "viral polypeptide required for the assembly of viral particles" means a
polypeptide normally encoded by the viral genome to be packaged into viral particles, in
the absence of which the viral genome cannot be packaged. For example, in the context of
retroviruses such polypeptides would include gag, pol and env. The terms "packaging
component" and "essential component" are also included within this definition.

In the case of antisense sequences, the third nucleotide sequence differs from the second
nucleotide sequence encoding the target viral packaging component antisense sequence to
the extent that although the antisense sequence can bind to the second nucleotide sequence,
or transcript thereof, the antisense sequence cannot bind effectively to the third nucleotide
sequence or RNA transcribed therefrom. The changes between the second and third
nucleotide sequences will typically be conservative changes, although a small number of
amino acid changes may be tolerated provided that, as described above, the function of the
essential/packaging components is not significantly impaired.



Preferably, in addition to eliminating the inhibitory RNA recognition sites, the alterations
to the coding sequences for the viral components improve the sequences for codon usage in
the mammalian cells or other cells which are to act as the producer cells for retroviral
vector particle production. This improvement in codon usage is referred to as "codon
optimisation". Many viruses, including HIV and other lentiviruses, use a large number of
rare codons and by changing these to correspond to commonly used mammalian codons,
increased expression of the packaging components in mammalian producer cells can be
achieved. Codon usage tables are known in the art for mammalian cells, as well as for a
variety of other organisms.

Thus preferably, the sequences encoding the packaging components are codon optimised. 

More preferably, the sequences are codon optimised in their entirety. Following codon
optimisation, it is found that there are numerous sites in the wild type gag, pol and env
sequences which can serve as inhibitory RNA recognition sites and which are no longer
present in the sequences encoding the packaging components. In an alternative but less
practical strategy, the sequences encoding the packaging components can be altered by
targeted conservative alterations so as to render them resistant to selected inhibitory RNAs
capable of effecting the cleavage of the wild type sequences.
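
The principle that synonymous recoding preserves the encoded protein while changing the nucleotide sequence seen by an inhibitory RNA can be sketched as follows. This Python example is illustrative only: the `PREFERRED` table is a hypothetical, partial set of codon preferences and the input sequence is invented, so neither should be read as the codon usage or sequences actually used for the constructs described here.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order
CODON_TABLE = dict(zip(
    (a + b + c for a in BASES for b in BASES for c in BASES), AMINO))

# Hypothetical preferences for illustration; a real table covers every amino acid
PREFERRED = {"V": "GTG", "L": "CTG", "K": "AAG", "A": "GCC"}


def codons(dna):
    """Split a DNA string into complete codons, ignoring trailing bases."""
    usable = len(dna) - len(dna) % 3
    return [dna[i:i + 3] for i in range(0, usable, 3)]


def translate(dna):
    """Translate DNA to a one-letter protein string ('*' marks a stop)."""
    return "".join(CODON_TABLE[c] for c in codons(dna))


def recode(dna, preferred=PREFERRED):
    """Swap each codon for the preferred synonymous codon, if one is listed."""
    return "".join(preferred.get(CODON_TABLE[c], c) for c in codons(dna))


def mismatches(a, b):
    """Count positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


wild_type = "ATGGTATTAAAAGCA"      # invented example encoding M-V-L-K-A
optimised = recode(wild_type)      # same protein, different nucleotides
```

In this toy case the protein is unchanged while 5 of 15 nucleotides differ, which is the kind of divergence that can disrupt base pairing with ribozyme binding arms, EGSs or antisense sequences directed at the wild type sequence.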

An additional advantage of codon optimising HIV packaging components is that this can
increase gene expression. In particular, it can render gag, pol expression Rev independent
so that rev and RRE need not be included in the genome (Haas et al., 1996).
Rev-independent vectors are therefore possible. This in turn enables the use of anti-rev or RRE
factors in the retroviral vector.



As described above, the packaging components for a retroviral vector include expression 
products of gag, pol and env genes. In accordance with the present invention, gag and pol
employed in the packaging system are derived from the target retrovirus on which the
vector genome is based. Thus, in the RNA transcript form, gag and pol would normally be
cleavable by the ribozymes present in the vector genome. The env gene employed in the
packaging system may be derived from a different virus, including other retroviruses such
as MLV and non-retroviruses such as VSV (a Rhabdovirus), in which case it may not need 
any sequence alteration to render it resistant to cleavage effected by the inhibitory RNA(s). 
Alternatively, env may be derived from the same retrovirus as gag and pol, in which case 
any recognition sites for the inhibitory RNA(s) will need to be eliminated by sequence 
alteration. 

The process of producing a retroviral vector in which the envelope protein is not the native 
envelope of the retrovirus is known as "pseudotyping". Certain envelope proteins, such as
MLV envelope protein and vesicular stomatitis virus G (VSV-G) protein, pseudotype
retroviruses very well. Pseudotyping can be useful for altering the target cell range of the
retrovirus. Alternatively, to maintain target cell specificity for target cells infected with the 
particular virus it is desired to treat, the envelope protein may be the same as that of the 
target virus, for example HIV. 

Other therapeutic coding sequences may be present along with the first nucleotide sequence 
or sequences. Other therapeutic coding sequences include, but are not limited to, 
sequences encoding cytokines, hormones, antibodies, immunoglobulin fusion proteins,
enzymes, immune co-stimulatory molecules, anti-sense RNA, a transdominant negative
mutant of a target protein, a toxin, a conditional toxin, an antigen, a single chain antibody,
a tumour suppressor protein and growth factors. When included, such coding sequences are
operatively linked to a suitable promoter, which may be the promoter driving expression of 
the first nucleotide sequence or a different promoter or promoters. 

Thus the invention comprises two components. The first is a genome construction that will 
be packaged by viral packaging components and which carries a series of anti-viral 
inhibitory RNA molecules such as anti-HIV EGSs. These could be any anti-HIV EGSs but
the key issue for this invention is that some of them result in cleavage of RNA that is
required for the expression of native or wild type HIV gag, pol or env coding sequences.
The second component is the packaging system which comprises a cassette for the
expression of HIV gag, pol and a cassette either for HIV env or an envelope gene encoding
a pseudotyping envelope protein - the packaging system being resistant to the inhibitory 
RNA molecules. 
The viral particles of the present invention, and the viral vector system and methods used
to produce them, may thus be used to treat or prevent viral infections, preferably retroviral
infections, in particular lentiviral, especially HIV, infections. Specifically, the viral
particles of the invention, typically produced using the viral vector system of the present
invention, may be used to deliver inhibitory RNA molecules to a human or animal in need
of treatment for a viral infection.

Alternatively, or in addition, the viral production system may be used to transfect cells
obtained from a patient ex vivo and then returned to the patient. Patient cells transfected ex
vivo may be formulated as a pharmaceutical composition (see below) prior to
readministration to the patient.

Preferably the viral particles are combined with a pharmaceutically acceptable carrier or 
diluent to produce a pharmaceutical composition. Thus, the present invention also provides
a pharmaceutical composition for treating an individual, wherein the composition 
comprises a therapeutically effective amount of the viral particle of the present invention, 
together with a pharmaceutically acceptable carrier, diluent, excipient or adjuvant. The 
pharmaceutical composition may be for human or animal usage. 

The choice of pharmaceutical carrier, excipient or diluent can be selected with regard to the
intended route of administration and standard pharmaceutical practice. Suitable carriers and
diluents include isotonic saline solutions, for example phosphate-buffered saline. The
pharmaceutical compositions may comprise as - or in addition to - the carrier, excipient or
diluent any suitable binder(s), lubricant(s), suspending agent(s), coating agent(s),
solubilising agent(s), and other carrier agents that may aid or increase the viral entry into 
the target site (such as for example a lipid delivery system). 

The pharmaceutical composition may be formulated for parenteral, intramuscular, 
intravenous, intracranial, subcutaneous, oral, intraocular or transdermal administration.

Where appropriate, the pharmaceutical compositions can be administered by any one or 
more of: inhalation, in the form of a suppository or pessary, topically in the form of a 
lotion, solution, cream, ointment or dusting powder, by use of a skin patch, orally in the
form of tablets containing excipients such as starch or lactose, or in capsules or ovules
either alone or in admixture with excipients, or in the form of elixirs, solutions or
suspensions containing flavouring or colouring agents, or they can be injected parenterally,
for example intracavernosally, intravenously, intramuscularly or subcutaneously. For
parenteral administration, the compositions may be best used in the form of a sterile
aqueous solution which may contain other substances, for example enough salts or
monosaccharides to make the solution isotonic with blood. For buccal or sublingual
administration the compositions may be administered in the form of tablets or lozenges
which can be formulated in a conventional manner.

The amount of virus administered is typically in the range of from 10"^ to 10^^ pfu, 
preferably from 10^ to 10^ pfu, more preferably from 10^ to lO'^ pfu. When injected, 
typically 1-10 µl of virus in a pharmaceutically acceptable suitable carrier or diluent is
administered.

When the polynucleotide/vector is administered as a naked nucleic acid, the amount of 
nucleic acid administered is typically in the range of from 1 µg to 10 mg, preferably from
100 µg to 1 mg.

Where the first nucleotide sequence (or other therapeutic sequence) is under the control of
an inducible regulatory sequence, it may only be necessary to induce gene expression for
the duration of the treatment. Once the condition has been treated, the inducer is removed
and expression of the NOI is stopped. This will clearly have clinical advantages. Such a
system may, for example, involve administering the antibiotic tetracycline, to activate gene
expression via its effect on the tet repressor/VP16 fusion protein.

The invention will now be further described by way of Examples, which are meant to serve 
to assist one of ordinary skill in the art in carrying out the invention and are not intended in 
any way to limit the scope of the invention. The Examples refer to the Figures. In the
Figures: 
Figure 1 shows schematically ribozymes inserted into four different HIV vectors;

Figure 2 shows schematically how to create a suitable 3' LTR by PCR;

Figure 3 shows the codon usage table for wild type HIV gag, pol of strain HXB2 (accession
number: K03455).

Figure 4 shows the codon usage table of the codon optimised sequence designated
gagpol-SYNgp.

Figure 5 shows the codon usage table of the wild type HIV env called env-mn.

Figure 6 shows the codon usage table of the codon optimised sequence of HIV env
designated SYNgp160mn.

Figure 7 shows three plasmid constructs for use in the invention.

Figure 8 shows the principle behind two systems for producing retroviral vector particles.

Figure 9A shows an EGS based on tyrosyl tRNA.

Figure 9B shows a consensus EGS sequence.

Figure 10 shows twelve different anti-HIV EGS constructs.

Figure 11 is a schematic representation of pDozenEgs and construction of pH4DozenEgs.

The invention will now be further described in the Examples which follow, which are 
intended as an illustration only and do not limit the scope of the invention. 

Reference Example 1 - Construction of a Ribozyme-encoding Genome

The HIV gagpol sequence was codon optimised (Figure 4 and SEQ I.D. No. 1) and
synthesised using overlapping oligos of around 40 nucleotides. This has three advantages.
Firstly, it allows an HIV based vector to carry ribozymes and other therapeutic factors.
Secondly, the codon optimisation generates a higher vector titre due to a higher level of
gene expression. Thirdly, gagpol expression becomes rev independent, which allows the
use of anti-rev or RRE factors.

Conserved sequences within gagpol were identified by reference to the HIV Sequence
database at Los Alamos National Laboratory (http://hiv-web.lanl.gov/) and used to design
ribozymes. Because of the variability between subtypes of HIV-1, the ribozymes were
designed to cleave the predominant subtype within North America, Latin America and the
Caribbean, Europe, Japan and Australia; that is, subtype B. The sites chosen were
cross-referenced with the synthetic gagpol sequence to ensure that there was a low possibility of
cutting the codon optimised gagpol mRNA. The ribozymes were designed with XhoI and
SalI sites at the 5' and 3' end respectively. This allows the construction of separate and
tandem ribozymes.

The ribozymes are hammerhead (Riddell et al., 1996) structures of the following general
structure:

   Helix I          Helix II           Helix III
5'-NNNNNNNN-CUGAUGAGGCCGAAAGGCCGAA-NNNNNNNN-3'

The catalytic domain of the ribozyme (Helix II) can tolerate some changes without
reducing catalytic turnover.

The cleavage sites, targeting gag and pol, with the essential GUX triplet (where X is any 
nucleotide base) are as follows: 
GAG 1   5' UAGUAAGAAUGUAUAGCCCUAC
GAG 2   5' AACCCAGAUUGUAAGACUAUUU
GAG 3   5' UGUUUCAAUUGUGGCAAAGAAG
GAG 4   5' AAAAAGGGCUGUUGGAAAUGUG
POL 1   5' ACGAGCCCUCGUCACAAUAAAG
POL 2   5' GGAAUUGGAGGUUUUAUCAAAG
POL 3   5' AUAUUUUUCAGUUCCCUUAGAU
POL 4   5' UGGAUGAUUUGUAUGUAGGAUC
POL 5   5' CUUUGGAUGGGUUAUGAACUCC
POL 6   5' CAGCUGGACUGUCAAUGACAUA
POL 7   5' AACUUUCUAUGUAGAUGGGGCA
POL 8   5' AAGGCCGCCUGUUGGUGGGCAG
POL 9   5' UAAGACAGCAGUACAAAUGGCA
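
As a consistency check on the thirteen target sites listed above, the short Python sketch below confirms that each is 22 nucleotides long and carries a GUX triplet starting at nucleotide 11. The placement of the triplet is an observation drawn from the listed sequences rather than a statement in the text, and the names used are hypothetical.

```python
TARGET_SITES = {
    "GAG 1": "UAGUAAGAAUGUAUAGCCCUAC",
    "GAG 2": "AACCCAGAUUGUAAGACUAUUU",
    "GAG 3": "UGUUUCAAUUGUGGCAAAGAAG",
    "GAG 4": "AAAAAGGGCUGUUGGAAAUGUG",
    "POL 1": "ACGAGCCCUCGUCACAAUAAAG",
    "POL 2": "GGAAUUGGAGGUUUUAUCAAAG",
    "POL 3": "AUAUUUUUCAGUUCCCUUAGAU",
    "POL 4": "UGGAUGAUUUGUAUGUAGGAUC",
    "POL 5": "CUUUGGAUGGGUUAUGAACUCC",
    "POL 6": "CAGCUGGACUGUCAAUGACAUA",
    "POL 7": "AACUUUCUAUGUAGAUGGGGCA",
    "POL 8": "AAGGCCGCCUGUUGGUGGGCAG",
    "POL 9": "UAAGACAGCAGUACAAAUGGCA",
}


def has_central_gux(site):
    """True if the site is 22 nt long with GU at nucleotides 11-12 (1-based)."""
    return len(site) == 22 and site[10:12] == "GU"


# Map each site name to the result of the check
report = {name: has_central_gux(seq) for name, seq in TARGET_SITES.items()}
```

A check of this kind is mainly useful for catching transcription errors when the site list is copied between documents.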



The ribozymes are inserted into four different HIV vectors (pH4 (Gervaix et al., 1997),
pH6, pH4.1, or pH6.1) (Figure 1). In pH4 and pH6, transcription of the ribozymes is
driven by an internal HCMV promoter (Foecking et al., 1986). From pH4.1 and pH6.1, the
ribozymes are expressed from the 5' LTR. The major difference between pH4 and pH6
(and pH4.1 and pH6.1) resides in the 3' LTR in the production plasmid. pH4 and pH4.1
have the HIV U3 in the 3' LTR. pH6 and pH6.1 have HCMV in the 3' LTR. The HCMV
promoter replaces most of the U3 and will drive expression at high constitutive levels
while the HIV-1 U3 will support a high level of expression only in the presence of Tat.

The HCMV/HIV-1 hybrid 3' LTR is created by recombinant PCR with three PCR primers
(Figure 2). The first round of PCR is performed with RIB1 and RIB2 using pH4 (Kim et
al., 1998) as the template to amplify the HIV-1 HXB2 sequence 8900-9123. The second
round of PCR makes the junction between the 5' end of the HIV-1 U3 and the HCMV
promoter by amplifying the hybrid 5' LTR from pH4. The PCR product from the first PCR
reaction and RIB3 serve as the 5' primer and 3' primer respectively.



RIB1   5'-CAGCTGCTCGAGCAGCTGAAGCTTGCATGC-3'
RIB2   5'-GTAAGTTATGTAACGGACGATATCTTGTCTTCTT-3'
RIB3   5'-CGCATAGTCGACGGGCCCGCCACTGCTAGAGATTTTC-3'

The PCR product is then cut with SphI and SalI and inserted into pH4, thereby replacing the
3' LTR. The resulting plasmid is designated pH6. To construct pH4.1 and pH6.1, the
internal HCMV promoter (SpeI - XhoI) in pH4 and pH6 is replaced with the polycloning
site of pBluescript II KS+ (Stratagene) (SpeI - XhoI).
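
The digestion step relies on restriction sites carried by the outer primers. The Python sketch below, which is illustrative and not part of the original disclosure, checks the three primer sequences given above for the standard recognition sequences of SphI (GCATGC), SalI (GTCGAC) and XhoI (CTCGAG); the dictionary and function names are hypothetical.

```python
PRIMERS = {
    "RIB1": "CAGCTGCTCGAGCAGCTGAAGCTTGCATGC",
    "RIB2": "GTAAGTTATGTAACGGACGATATCTTGTCTTCTT",
    "RIB3": "CGCATAGTCGACGGGCCCGCCACTGCTAGAGATTTTC",
}

# Recognition sequences; all three are palindromic, so strand does not matter
SITES = {"SphI": "GCATGC", "SalI": "GTCGAC", "XhoI": "CTCGAG"}


def sites_in(primer):
    """Return the sorted names of the enzymes whose site occurs in the primer."""
    return sorted(name for name, motif in SITES.items() if motif in primer)


# Map each primer name to the list of sites it carries
summary = {name: sites_in(seq) for name, seq in PRIMERS.items()}
```

On these sequences RIB1 carries the SphI site (and an XhoI site) and RIB3 carries the SalI site, which is consistent with the final PCR product being cut with SphI and SalI.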

The ribozymes are inserted into the XhoI sites in the genome vector backbones. Any
ribozymes in any configuration could be used in a similar way. 

Reference Example 2 - Construction of a Packaging System

The packaging system can take various forms. In a first form of packaging system, the HIV
gag, pol components are co-expressed with the HIV env coding sequence. In this case,
both the gag, pol and the env coding sequences are altered such that they are resistant to the
anti-HIV ribozymes that are built into the genome. At the same time as altering the codon
usage to achieve resistance, the codons can be chosen to match the usage pattern of the
most highly expressed mammalian genes. This dramatically increases expression levels
and so increases titre. A codon optimised HIV env coding sequence has been described by
Haas et al. (1996). In the present example, a modified codon optimised HIV env sequence
is used (SEQ I.D. No. 3). The corresponding env expression plasmid is designated
pSYNgp160mn. The modified sequence contains extra motifs not used by Haas et al. The
extra sequences were taken from the HIV env sequence of strain MN and codon optimised.
Any similar modification of the nucleic acid sequence would function similarly as long as
it used codons corresponding to abundant tRNAs (Zolotukhin et al., 1996) and led to
resistance to the ribozymes in the genome.

In one example of a gag, pol coding sequence with optimised codon usage, overlapping
oligonucleotides are synthesised and then ligated together to produce the synthetic coding
sequence. The sequences of a wild-type (Genbank accession no. K03455) and a synthetic
(gagpol-SYNgp) gagpol sequence are shown in SEQ I.D. Nos 1 and 2, respectively, and their
codon usage is shown in Figures 3 and 4, respectively. The sequence of a wild-type env
coding sequence (Genbank accession no. M17449) is given in SEQ I.D. No. 3, the
sequence of a synthetic codon optimised sequence is given in SEQ I.D. No. 4 and their
codon usage tables are given in Figures 5 and 6, respectively. As with the env coding
sequence, any gag, pol sequence that achieves resistance to the ribozymes could be used.
The synthetic sequence shown is designated gag, pol-SYNgp and has an EcoRI site at the 5'
end and a NotI site at the 3' end. It is inserted into pCIneo (Promega) to produce plasmid
pSYNgp.
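
A codon usage comparison of the kind referred to above may be tallied, by way of a minimal sketch, as follows. The function names are assumptions introduced for illustration, and only the first 30 nucleotides of SEQ I.D. Nos 1 and 2 are used, so the counts are not those of the figures.

```python
from collections import Counter


def codons(dna):
    """Split a coding sequence into codons, ignoring a trailing partial codon."""
    dna = dna.lower()
    usable = len(dna) - len(dna) % 3
    return [dna[i:i + 3] for i in range(0, usable, 3)]


def codon_usage(dna):
    """Count how often each codon occurs."""
    return Counter(codons(dna))


def gc3_fraction(dna):
    """Fraction of codons whose third base is g or c."""
    cs = codons(dna)
    return sum(c[2] in "gc" for c in cs) / len(cs)


# First 30 nucleotides of SEQ I.D. No. 1 (wild type) and No. 2 (synthetic).
wild_type = "atgggtgcgagagcgtcagtattaagcggg"
synthetic = "atgggcgcccgcgccagcgtgctgtcgggc"

print(codon_usage(wild_type).most_common(3))
print(codon_usage(synthetic).most_common(3))
print(gc3_fraction(wild_type), gc3_fraction(synthetic))
```

The third-position G/C fraction is used here only as a convenient single-number summary of the shift in codon preference; it is not a measure recited in the specification.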

The sequence of the codon optimised gagpol sequence is shown in SEQ I.D. No. 2. This
sequence starts at the ATG and ends at the stop codon of gagpol. The wild type sequence
is retained around the frameshift site so that the right amount of gagpol is made.
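
The retention of wild-type sequence may be checked, in a hedged sketch, by comparing corresponding stretches of SEQ I.D. Nos 1 and 2. The example below uses nucleotides 1261 to 1320 of each sequence as given in the sequence listing. The helper name and the use of the heptamer tttttta as a marker for the frameshift site are assumptions for illustration; the specification does not itself define the boundaries of the retained region.

```python
def differing_positions(a, b):
    """Return the 0-based positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Nucleotides 1261-1320 of SEQ I.D. No. 1 (wild type) and No. 2 (synthetic).
wt = "".join(
    "caccaaatga aagattgtac tgagagacag gctaattttt tagggaagat ctggccttcc".split()
)
syn = "".join(
    "caccagatga aagactgtac tgagagacag gctaattttt tagggaagat ctggccttcc".split()
)

diffs = differing_positions(wt, syn)
slip = wt.find("tttttta")  # assumed marker for the frameshift site

print(diffs)
print(slip, all(i < slip for i in diffs))
```

In this stretch the two sequences differ only upstream of the heptamer and are identical from there onwards, which is consistent with the statement that the wild type sequence is kept around the frameshift site.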

In addition, other constructs can be used that contain the optimised gagpol of pSYNgp but
also have differing amounts of the wild type HIV-1 sequence of strain HXB2 (accession
number: K03455) at the 5' end. These constructs are described below (the start ATG of
pSYNgp is shown in bold in these sequences).

pSYNgp2 contains the entire leader sequence of HIV-1 (SEQ ID. No. 12).

pSYNgp3 contains the leader sequence of HIV-1 from the major splice donor (SEQ ID. No. 13).

pSYNgp4 contains 20 bp of the leader sequence of HIV-1 upstream of the ATG start codon
(SEQ ID. No. 14).

These constructs may be made by overlapping PCR. Using appropriate restriction enzymes,
these sequences can be inserted into mammalian expression vectors such as pCI-Neo
(Promega). All these gag/pol constructs can be used to supply HIV gag/pol for the
generation of viral vectors. These viral vectors can be used to express EGS
molecules, ribozyme molecules, antisense molecules or any peptides or proteins.

In a second form of the packaging system, a synthetic gag, pol cassette is coexpressed with
a non-HIV envelope coding sequence that produces a surface protein that pseudotypes
HIV. This could be, for example, VSV-G (Ory et al., 1996; Zhu et al., 1990), amphotropic
MLV env (Chesebro et al., 1990; Specter et al., 1990) or any other protein that would be
incorporated into the HIV particle (Valsesia-Wittman, 1994). This includes molecules
capable of targeting the vector to specific tissues. Coding sequences for non-HIV envelope
proteins are not cleaved by the ribozymes and so no sequence modification is required
(although some sequence modification may be desirable for other reasons such as
optimisation for codon usage in mammalian cells).

Reference Example 3 - Vector Particle Production



Vector particles can be produced either from a transient three-plasmid transfection system
similar to that described by Soneoka et al. (1995) or from producer cell lines similar to
those used for other retroviral vectors (Ory et al., 1996; Srinivasakumar et al., 1991; Yu et
al., 1996). These principles are illustrated in Figures 7 and 8. For example, by using
pH6Rz, pSYNgp and pRV67 (VSV-G expression plasmid) in a three-plasmid transfection
of 293T cells (Figure 8), as described by Soneoka et al. (1995), vector particles designated
H6Rz-VSV are produced. These transduce the H6Rz genome to CD4+ cells such as
C1866 or Jurkat and produce the multitarget ribozymes. HIV replication in these cells is
now severely restricted.

Example 1 - Use of external guide sequences for inhibiting HIV

Ribonuclease P (RNase P) is a nuclear localised enzyme consisting of protein and RNA subunits. It
has been found in all organisms examined and is one of the most abundant, stable and
efficient enzymes in cells. Its enzymatic activity is responsible for the maturation of the 5'
termini of all tRNAs, which account for about 2% of the total cellular RNA.

For tRNA processing, it has been shown that RNase P recognises a secondary structure of
the tRNA. However, extensive studies have shown that any complex of two RNA
molecules which resembles a single tRNA molecule will also be recognised and cleaved by
RNase P. Consequently the natural activity of RNase P can be, and has been, successfully
redirected to target other RNA species (see Yuan and Altman, 1994, and references therein).
This is achieved by engineering a sequence, containing the flanking motif recognised by
RNase P, to bind the desired target sequence. These sequences are called external guide
sequences (EGSs).

Outlined here is a strategy employing the EGS system against HIV RNA. Shown in Figure
2 A, B and C are twelve EGS sequences designed to target twelve separate HIV gag/pol
sequences. These target sequences are conserved throughout clade B of HIV. The
sequence numbering in each figure designates the position of the required conserved G of
each target sequence based on the HXB2 published sequence.

The external guide sequences shown here all have anticodon stem-loops deleted. These are 
non-limiting examples; for instance full length 3/4 tRNA based EGSs might be used if 
preferred (see Yuan and Altman, 1994). 

Outlined in SEQ ID. Nos. 5 to 10 (see below) and Figure 11 is the cloning strategy
employed to construct an HIV vector containing the EGSs described in SEQ ID. Nos. 5 to
10. The oligonucleotides prefixed 1, 2, 3, 4, 5 and 6 are respectively annealed together and
sequentially cloned into the pSP72 (Promega) cloning vector, starting with the oligo duplex
1/1A being cloned into the XhoI-SalI site such that EGS 4762 and EGS 4715 are
orientated away from the ampicillin gene. The remaining oligonucleotides (with SalI ends)
are subsequently cloned stepwise (starting with oligo duplex 2/2A, ending with duplex
6/6A) into the unique SalI site (present within the terminus of each preceding
oligonucleotide) to create the plasmid pDOZENEGS. The EGSs from this vector are then
transferred by XhoI-SphI digest into pH4Z, similarly cut, such that the multiple EGS
cassette replaces the lacZ gene of pH4Z (Kim et al., 1998). The resulting vector is named
pH4DOZENEGS (see SEQ ID. No. 11 for complete sequence).
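
As a non-limiting aid to checking annealed oligonucleotide pairs of the kind listed below, the following sketch tests whether a lower strand, written 3' to 5' as in the listing, is the base-by-base complement of the upper strand once the single-stranded 5' overhang is skipped. The function names and the four-nucleotide overhang parameter are assumptions for illustration; the fragments are the first 20 and 16 nucleotides of the upper and lower strands of duplex 1/1A.

```python
# Complement table for lower-case DNA.
COMPLEMENT = str.maketrans("acgt", "tgca")


def complement(seq):
    """Base-by-base complement, without reversing the strand."""
    return seq.lower().translate(COMPLEMENT)


def strands_pair(top_5to3, bottom_3to5, overhang=4):
    """True if the lower strand (written 3'->5') pairs with the upper strand
    starting immediately after the upper strand's 5' overhang."""
    paired = top_5to3[overhang:overhang + len(bottom_3to5)]
    return len(paired) == len(bottom_3to5) and complement(paired) == bottom_3to5.lower()


# Start of duplex 1/1A as printed: upper strand 5'->3', lower strand 3'->5'.
top = "tcgagcccggggatgacgtc"
bottom = "cgggcccctactgcag"

print(top[:4])                    # single-stranded 5' overhang of the upper strand
print(strands_pair(top, bottom))
```

A check of this kind is useful when transcribing printed duplexes, since a single mis-read base in either strand shows up as a failed pairing.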

Egs 1/1A (SEQ ID. No. 5)

   XhoI
5'-tcgagcccggggatgacgtcatcgacttcgaaggttcgaatccttctactgccaccatttttt
       cgggcccctactgcagtagctgaagcttccaagcttaggaagatgacggtggtaaaaaa

   ctctacgtcatcgacttcgaaggttcgaatccttccctgtccaccagtcgacc-3'
   gagatgcagtagctgaagcttccaagcttaggaagggacaggtggtcagctggagct-5'

Egs 2/2A (SEQ ID. No. 6)

5'-tcgagtattacgtcatcgacttcgaaggttcgaatccttctagattcaccatttttttaggaacg
       cataatgcagtagctgaagcttccaagcttaggaagtactaagtggtaaaaaatccttgc

   tcatcgacttcgaaggttcgaatccttccagttccaccagtcgacc-3'
   agtagctgaagcttccaagcttaggaaggtcaaggtggtcagctggagct-5'

Egs 3/3A (SEQ ID. No. 7)

5'-tcgaggccaacgtcatcgacttcgaaggttcgaatccttctcttcccaccattttttttcc
       ccggttgcagtagctgaagcttccaagcttaggaagagaagggtggtaaaaaaaagg

   acgtcatcgacttcgaaggttcgaatccttcggggcccaccagtcgacc-3'
   tgcagtagctgaagcttccaagcttaggaagccccgggtggtcagctggagct-5'

Egs 4/4A (SEQ ID. No. 8) 

5' - tcgagggctacgtcatcgacttcgaaggttcgaatccttcttgcttcaccatttttt 
cccgatgcagtagctgaagcttccaagcttaggaagaacgaagtggtaaaaaa 

ctgaacgtcatcgacttcgaaggttcgaatccttctgctgtcaccagtcgacc-3' 
gacttgcagtagctgaagcttccaagcttaggaagacgacagtggtcagctggagct-5' 

Egs 5/5A (SEQ ID. No. 9)

5'-tcgagtataacgtcatcgacttcgaaggttcgaatccttcaccggtcaccatttttttata
       catattgcagtagctgaagcttccaagcttaggaagtggccagtggtaaaaaaatat

   acgtcatcgacttcgaaggttcgaatccttcttcttacaccagtcgacc-3'
   tgcagtagctgaagcttccaagcttaggaagaagaatgtggtcagctggagct-5'

Egs 6/6A (SEQ ID. No. 10)

5'-tcgaggtacacgtcatcgacttcgaaggttcgaatccttcgtagttcaccattttttgtgc
       ccatgtgcagtagctgaagcttccaagcttaggaagcatcaagtggtaaaaaacacg

   SphI

   acgtcatcgacttcgaaggttcgaatccttctaggcccaccagtcgacgcatgcc-3'
   tgcagtagctgaagcttccaagcttaggaagatccgggtggtcagctgcgtacggagct-5'

The pH4DOZENEGS vector may be used both to deliver and to express the example EGS
sequences in appropriate eukaryotic cells in the manner described for ribozymes in
Reference Examples 1, 2 and 3, whereby the use of codon optimised gag/pol and env genes
would prevent the EGSs from targeting these genes during viral production. The inclusion of
the EGS sequences in an HIV derived vector will not only allow expression of such
sequences in the target cell but also packaging and transfer of such therapeutic sequences
by the patient's own HIV. These example EGS sequences target HIV RNA for cleavage by
RNase P. This example is not limiting and other suitable EGS and derived sequences may
also be used, whether expressed singly or in multiples, from pol I, pol II or pol III
promoters and derivatives thereof, and/or in combination with other HIV treatments. Other
appropriate nucleotide sequences of interest (NOIs) may also be included in combination
with EGSs if preferred.

All publications mentioned in the above specification are herein incorporated by reference. 

Various modifications and variations of the described methods and system of the invention
will be apparent to those skilled in the art without departing from the scope and spirit of the
invention. Although the invention has been described in connection with specific preferred
embodiments, it should be understood that the invention as claimed should not be unduly
limited to such specific embodiments. Indeed, various modifications of the described
modes for carrying out the invention which are obvious to those skilled in molecular
biology or related fields are intended to be within the scope of the following claims.

1. A viral vector system comprising:

(i) a first nucleotide sequence encoding an external guide sequence capable of binding 
to and effecting the cleavage by RNase P of a second nucleotide sequence, or transcription 
product thereof, encoding a viral polypeptide required for the assembly of viral particles; 
and 

(ii) a third nucleotide sequence encoding said viral polypeptide required for the 
assembly of viral particles, which third nucleotide sequence has a different nucleotide 
sequence to the second nucleotide sequence such that the third nucleotide sequence, or 
transcription product thereof, is resistant to cleavage directed by the external guide 
sequence. 

2. A system according to claim 1 further comprising at least one further first 
nucleotide sequence encoding a gene product capable of binding to and effecting the 
cleavage, directly or indirectly, of a second nucleotide sequence, or transcription product 
thereof, encoding a viral polypeptide required for the assembly of viral particles, wherein 
the gene product is selected from an external guide sequence, a ribozyme and an anti-sense 
ribonucleic acid. 

3. A viral vector production system comprising:

(i) a viral genome comprising at least one first nucleotide sequence encoding a gene 
product capable of binding to and effecting the cleavage, directly or indirectly, of a second 
nucleotide sequence, or transcription product thereof, encoding a viral polypeptide required 
for the assembly of viral particles; 

(ii) a third nucleotide sequence encoding said viral polypeptide required for the 
assembly of the viral genome into viral particles, which third nucleotide sequence has a 
different nucleotide sequence to the second nucleotide sequence such that said third 
nucleotide sequence, or transcription product thereof, is resistant to cleavage directed by 
said gene product; 

wherein at least one of the gene products is an external guide sequence capable of binding
to and effecting the cleavage by RNase P of the second nucleotide sequence.

4. A system according to claim 3 wherein in addition to an external guide sequence, at
least one gene product is selected from a ribozyme and an anti-sense ribonucleic acid.

5. A system according to any one of claims 1 to 4 wherein the viral vector is a 
retroviral vector. 

6. A system according to claim 5 wherein the retroviral vector is a lentiviral vector.

7. A system according to claim 6 wherein the lentiviral vector is an HIV vector.

8. A system according to any one of claims 5 to 7 wherein the polypeptide required for 
the assembly of viral particles is selected from gag, pol and env proteins. 

9. A system according to claim 8 wherein at least the gag and pol proteins are from a 
lentivirus. 

10. A system according to claim 7 wherein the env protein is from a lentivirus. 

11. A system according to claim 9 or 10 wherein the lentivirus is HIV.

12. A system according to any one of the preceding claims wherein the third nucleotide 
sequence is resistant to cleavage directed by the gene product as a result of one or more 
conservative alterations in the nucleotide sequence which remove cleavage sites recognised 
by the at least one gene product and/or binding sites for the at least one gene product.

13. A system according to any one of claims 1 to 11 wherein the third nucleotide 
sequence is adapted to be resistant to cleavage by the at least one gene product. 

14. A system according to any one of the preceding claims wherein the third nucleotide 
sequence is codon optimised for expression in producer cells. 

15. A system according to claim 14, wherein the producer cells are mammalian cells.

16. A system according to any one of the preceding claims comprising a plurality of
first nucleotide sequences and third nucleotide sequences as defined therein.

17. A viral particle comprising a viral vector genome as defined in any one of claims 3 
to 16 and one or more third nucleotide sequences as defined in any of claims 3 to 16.

18. A viral particle produced using a viral vector production system according to any 
one of claims 3 to 16. 

19. A method for producing a viral particle which method comprises introducing into a 
host cell (i) a viral genome as defined in any one of claims 3 to 16, (ii) one or more third
nucleotide sequences as defined in any of claims 3 to 16 and (iii) nucleotide sequences 
encoding the other essential viral packaging components not encoded by the one or more 
third nucleotide sequences. 

20. A viral particle produced by the method of claim 19. 

21. A pharmaceutical composition comprising a viral particle according to claims 17,
18 or 20 together with a pharmaceutically acceptable carrier or diluent.

22. A viral system according to any one of claims 1 to 17 or a viral particle according to
claims 17, 18 or 20 in treating a viral infection. 

23. A viral system according to any one of claims 1 to 17 for use in a method of
producing viral particles.

Figure 9B

Generic design of EGSs to target any RNA.

Atty. Dkt. No. 078883-0137

DECLARATION AND POWER OF ATTORNEY 

As a below named inventor, I HEREBY DECLARE: 

THAT my residence, post office address, and citizenship are as stated below next to my name; 

THAT I believe I am the original, first, and sole inventor (if only one inventor is named below) or an 
original, first, and joint inventor (if plural inventors are named below or in an attached Declaration) of the 
subject matter which is claimed and for which a patent is sought on the invention entitled 

ANTI-VIRAL VECTORS



(Attorney Docket No. 078883-0137) 

the specification of which (check one) 

is attached hereto. 

XX was filed on March 17, 2000 as United States Application Number or PCT
International Application Number PCT/GB00/01002 and was amended on ____
(if applicable).

THAT I do not know and do not believe that the same invention was ever known or used by others in 
the United States of America, or was patented or described in any printed publication in any country, before I 
(we) invented it; 

THAT I do not know and do not believe that the same invention was patented or described in any 
printed publication in any country, or in public use or on sale in the United States of America, for more than one 
year prior to the filing date of this United States application; 

THAT I do not know and do not believe that the same invention was first patented or made the subject 
of an inventor's certificate that issued in any country foreign to the United States of America before the filing 
date of this United States application if the foreign application was filed by me (us), or by my (our) legal 
representatives or assigns, more than twelve months (six months for design patents) prior to the filing date of 
this United States application; 

THAT I have reviewed and understand the contents of the above-identified specification, including the 
claim(s), as amended by any amendment specifically referred to above; 

THAT I believe that the above-identified specification contains a written description of the invention, 
and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable 
any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the 
invention, and sets forth the best mode contemplated by me of carrying out the invention; and 

THAT I acknowledge the duty to disclose to the U.S. Patent and Trademark Office all information 
known to me to be material to patentability as defined in Title 37, Code of Federal Regulations, §1.56.

I HEREBY CLAIM foreign priority benefits under Title 35, United States Code §119(a)-(d) or 
§ 365(b) of any foreign application(s) for patent or inventor's certificate, or §365(a) of any PCT international 
application which designated at least one country other than the United States of America, listed below and have 
also identified below any foreign application for patent or inventor's certificate or of any PCT international 
application having a filing date before that of the application on which priority is claimed. 






Prior Foreign        Country          Foreign        Priority    Certified Copy
Application Number                    Filing Date    Claimed?    Attached?

9906177.2            Great Britain    03/17/1999     YES

I HEREBY CLAIM the benefit under Title 35, United States Code § 119(e) of any United States
provisional application(s) listed below. 



U.S. Provisional Application Number    Filing Date

I HEREBY CLAIM the benefit under Title 35, United States Code, §120 of any United States 
application(s), or § 365(c) of any PCT international application designating the United States of America, listed 
below and, insofar as the subject matter of each of the claims of this application is not disclosed in the prior 
United States or PCT International application in the manner provided by the first paragraph of Title 35, United 
States Code, § 112, I acknowledge the duty to disclose information which is material to patentability as defined
in Title 37, Code of Federal Regulations, § 1.56 which became available between the filing date of the prior 
application and the national or PCT international filing date of this application. 



U.S. Parent           PCT Parent            Parent         Parent
Application Number    Application Number    Filing Date    Patent Number

I HEREBY APPOINT the following registered attorneys and agents of the law firm of FOLEY & 
LARDNER: 




STEPHEN A. BENT          Reg. No. 29,768
DAVID A. BLUMENTHAL      Reg. No.
BETH A. BURROUS          Reg. No. [illegible]
ALAN I. CANTOR           Reg. No.
WILLIAM T. ELLIS         Reg. No. [illegible]
JOHN J. FELDHAUS         Reg. No.
MICHAEL D. KAMINSKI      Reg. No.
LYLE K. KIMMS            Reg. No. [illegible]
KENNETH E. KROSIN        Reg. No.
JOHNNY A. KUMAR          Reg. No.
JACK LAHR                Reg. No. 19,621
GLENN LAW                Reg. No.
PETER G. MACK            Reg. No.
STEPHEN B. MAEBIUS       Reg. No.
BRIAN J. MC NAMARA       Reg. No. [illegible]
SYBIL MELOY              Reg. No. [illegible]
RICHARD C. PEET          Reg. No.
GEORGE E. QUILLIN        Reg. No. [illegible]
ANDREW E. RAWLINS        Reg. No. 34,702
BERNHARD D. SAXE         Reg. No. 28,665
CHARLES F. SCHILL        Reg. No.
RICHARD L. SCHWAAB       Reg. No.
MICHELE M. SIMKIN        Reg. No. [illegible]
HAROLD C. WEGNER         Reg. No. [illegible]



to have full power to prosecute this application and any continuations, divisions, reissues, and reexaminations 
thereof, to receive the patent, and to transact all business in the United States Patent and Trademark Office 
connected therewith. 



I request that all correspondence be directed to:

Washington Harbour
3000 K Street
Washington, D.C. 20007-5109

Telephone: (202) 672-5427
Facsimile: (202) 672-5399



I UNDERSTAND AND AGREE THAT the foregoing attorneys and agents appointed by me to 
prosecute this application do not personally represent me or my legal interests, but instead represent the interests 
of the legal owner(s) of the invention described in this application. 

I FURTHER DECLARE THAT all statements made herein of my own knowledge are true, and that all 
statements made on information and belief are believed to be true; and further that these statements were made 
with the knowledge that willful false statements and the like so made are punishable by fine or imprisonment, or
both, under Section 1001 of Title 18 of the United States Code, and that such willful false statements may
jeopardize the validity of the application or any patent issuing thereon.



Name of first inventor     Mark UDEN
Residence                  London, Great Britain
Citizenship                British
Post Office Address        Flat 2, Finsbury Park
                           17 Sommerfield Road
                           London, 2JN Great Britain
Inventor's signature
Date

Name of second inventor    Kyriacos MITROPHANOUS
Residence                  Oxford, Great Britain
Citizenship                British
Post Office Address        85 Warwick Street
                           Oxford, OX4 1SZ Great Britain
Inventor's signature
Date

SEQUENCE LISTING

<110> UDEN, MARK
      MITROPHANOUS, KYRIACOS

<120> ANTI-VIRAL VECTORS

<130> 078883/0137

<140> 09/936,572
<141> 2001-09-14

<150> PCT/GB00/01002
<151> 2000-03-17

<150> GB 9906177.2
<151> 1999-03-17

<160> 73

<170> PatentIn Ver. 2.1

<210> 1
<211> 4307
<212> DNA
<213> Human immunodeficiency virus type 1

<400> 1

atgggtgcga gagcgtcagt attaagcggg ggagaattag atcgatggga aaaaattcgg 60
ttaaggccag ggggaaagaa aaaatataaa ttaaaacata tagtatgggc aagcagggag 120
ctagaacgat tcgcagttaa tcctggcctg ttagaaacat cagaaggctg tagacaaata 180
ctgggacagc tacaaccatc ccttcagaca ggatcagaag aacttagatc attatataat 240
acagtagcaa ccctctattg tgtgcatcaa aggatagaga taaaagacac caaggaagct 300
ttagacaaga tagaggaaga gcaaaacaaa agtaagaaaa aagcacagca agcagcagct 360
gacacaggac acagcaatca ggtcagccaa aattacccta tagtgcagaa catccagggg 420
caaatggtac atcaggccat atcacctaga actttaaatg catgggtaaa agtagtagaa 480
gagaaggctt tcagcccaga agtgataccc atgttttcag cattatcaga aggagccacc 540
ccacaagatt taaacaccat gctaaacaca gtggggggac atcaagcagc catgcaaatg 600
ttaaaagaga ccatcaatga ggaagctgca gaatgggata gagtgcatcc agtgcatgca 660
gggcctattg caccaggcca gatgagagaa ccaaggggaa gtgacatagc aggaactact 720
agtacccttc aggaacaaat aggatggatg acaaataatc cacctatccc agtaggagaa 780
atttataaaa gatggataat cctgggatta aataaaatag taagaatgta tagccctacc 840
agcattctgg acataagaca aggaccaaag gaacccttta gagactatgt agaccggttc 900
tataaaactc taagagccga gcaagcttca caggaggtaa aaaattggat gacagaaacc 960
ttgttggtcc aaaatgcgaa cccagattgt aagactattt taaaagcatt gggaccagcg 1020
gctacactag aagaaatgat gacagcatgt cagggagtag gaggacccgg ccataaggca 1080
agagttttgg ctgaagcaat gagccaagta acaaattcag ctaccataat gatgcagaga 1140
ggcaatttta ggaaccaaag aaagattgtt aagtgtttca attgtggcaa agaagggcac 1200
acagccagaa attgcagggc ccctaggaaa aagggctgtt ggaaatgtgg aaaggaagga 1260
caccaaatga aagattgtac tgagagacag gctaattttt tagggaagat ctggccttcc 1320
tacaagggaa ggccagggaa ttttcttcag agcagaccag agccaacagc cccaccagaa 1380
gagagcttca ggtctggggt agagacaaca actccccctc agaagcagga gccgatagac 1440
aaggaactgt atcctttaac ttccctcagg tcactctttg gcaacgaccc ctcgtcacaa 1500
taaagatagg ggggcaacta aaggaagctc tattagatac aggagcagat gatacagtat 1560
tagaagaaat gagtttgcca ggaagatgga aaccaaaaat gataggggga attggaggtt 1620
ttatcaaagt aagacagtat gatcagatac tcatagaaat ctgtggacat aaagctatag 1680
gtacagtatt agtaggacct acacctgtca acataattgg aagaaatctg ttgactcaga 1740
ttggttgcac tttaaatttt cccattagcc ctattgagac tgtaccagta aaattaaagc 1800



caggaatgga tggcccaaaa gttaaacaat ggccattgac agaagaaaaa ataaaagcat 1860
tagtagaaat ttgtacagag atggaaaagg aagggaaaat ttcaaaaatt gggcctgaaa 1920
atccatacaa tactccagta tttgccataa agaaaaaaga cagtactaaa tggagaaaat 1980
tagtagattt cagagaactt aataagagaa ctcaagactt ctgggaagtt caattaggaa 2040
taccacatcc cgcagggtta aaaaagaaaa aatcagtaac agtactggat gtgggtgatg 2100
catatttttc agttccctta gatgaagact tcaggaagta tactgcattt accataccta 2160
gtataaacaa tgagacacca gggattagat atcagtacaa tgtgcttcca cagggatgga 2220
aaggatcacc agcaatattc caaagtagca tgacaaaaat cttagagcct tttagaaaac 2280
aaaatccaga catagttatc tatcaataca tggatgattt gtatgtagga tctgacttag 2340
aaatagggca gcatagaaca aaaatagagg agctgagaca acatctgttg aggtggggac 2400
ttaccacacc agacaaaaaa catcagaaag aacctccatt cctttggatg ggttatgaac 2460
tccatcctga taaatggaca gtacagccta tagtgctgcc agaaaaagac agctggactg 2520
tcaatgacat acagaagtta gtggggaaat tgaattgggc aagtcagatt tacccaggga 2580
ttaaagtaag gcaattatgt aaactcctta gaggaaccaa agcactaaca gaagtaatac 2640
cactaacaga agaagcagag ctagaactgg cagaaaacag agagattcta aaagaaccag 2700
tacatggagt gtattatgac ccatcaaaag acttaatagc agaaatacag aagcaggggc 2760
aaggccaatg gacatatcaa atttatcaag agccatttaa aaatctgaaa acaggaaaat 2820
atgcaagaat gaggggtgcc cacactaatg atgtaaaaca attaacagag gcagtgcaaa 2880
aaataaccac agaaagcata gtaatatggg gaaagactcc taaatttaaa ctgcccatac 2940
aaaaggaaac atgggaaaca tggtggacag agtattggca agccacctgg attcctgagt 3000
gggagtttgt taatacccct cccttagtga aattatggta ccagttagag aaagaaccca 3060
tagtaggagc agaaaccttc tatgtagatg gggcagctaa cagggagact aaattaggaa 3120
aagcaggata tgttactaat agaggaagac aaaaagttgt caccctaact gacacaacaa 3180
atcagaagac tgagttacaa gcaatttatc tagctttgca ggattcggga ttagaagtaa 3240
acatagtaac agactcacaa tatgcattag gaatcattca agcacaacca gatcaaagtg 3300
aatcagagtt agtcaatcaa ataatagagc agttaataaa aaaggaaaag gtctatctgg 3360
catgggtacc agcacacaaa ggaattggag gaaatgaaca agtagataaa ttagtcagtg 3420
ctggaatcag gaaagtacta tttttagatg gaatagataa ggcccaagat gaacatgaga 3480
aatatcacag taattggaga gcaatggcta gtgattttaa cctgccacct gtagtagcaa 3540
aagaaatagt agccagctgt gataaatgtc agctaaaagg agaagccatg catggacaag 3600
tagactgtag tccaggaata tggcaactag attgtacaca tttagaagga aaagttatcc 3660
tggtagcagt tcatgtagcc agtggatata tagaagcaga agttattcca gcagaaacag 3720
ggcaggaaac agcatatttt cttttaaaat tagcaggaag atggccagta aaaacaatac 3780
atactgacaa tggcagcaat ttcaccggtg ctacggttag ggccgcctgt tggtgggcgg 3840
gaatcaagca ggaatttgga attccctaca atccccaaag tcaaggagta gtagaatcta 3900
tgaataaaga attaaagaaa attataggac aggtaagaga tcaggctgaa catcttaaga 3960
cagcagtaca aatggcagta ttcatccaca attttaaaag aaaagggggg attggggggt 4020
acagtgcagg ggaaagaata gtagacataa tagcaacaga catacaaact aaagaattac 4080
aaaaacaaat tacaaaaatt caaaattttc gggtttatta cagggacagc agaaattcac 4140
tttggaaagg accagcaaag ctcctctgga aaggtgaagg ggcagtagta atacaagata 4200
atagtgacat aaaagtagtg ccaagaagaa aagcaaagat cattagggat tatggaaaac 4260
agatggcagg tgatgattgt gtggcaagta gacaggatga ggattag 4307



<210> 2 
<211> 4307 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 

gagpol-SYNgp-codon optimised gagpol sequence 

<400> 2 

atgggcgccc gcgccagcgt gctgtcgggc ggcgagctgg accgctggga gaagatccgc 60
ctgcgccccg gcggcaaaaa gaagtacaag ctgaagcaca tcgtgtgggc cagccgcgaa 120
ctggagcgct tcgccgtgaa ccccgggctc ctggagacca gcgaggggtg ccgccagatc 180
ctcggccaac tgcagcccag cctgcaaacc ggcagcgagg agctgcgcag cctgtacaac 240



3 



accgtggcca 
ctggataaaa 
gacaccggac 
cagatggtgc 
gagaaggctt 
ccccaagatc 
ctgaaggaga 
gggcccatcg 
agtacccttc 
atctacaaac 
agcatcctgg 
tacaaaacgc 
ctgctggtcc 
gctaccctag 
cgcgtcctgg 
ggcaactttc 
acagcccgca 
caccagatga 
tacaagggaa 
gagagcttca 
aaggaactgt 
taaagatagg 
tggaggagat 
tcatcaaggt 
gtaccgtgct 
tcggttgcac 
ccgggatgga 
tggtggagat 
acccgtacaa 
tggtggactt 
tcccgcaccc 
cctacttctc 
cgatcaacaa 
aaggctctcc 
agaaccccga 
agatagggca 
tgaccacacc 
tgcaccctga 
tcaacgacat 
ttaaggtgag 
ccctaaccga 
tgcacggcgt 
aaggccagtg 
acgcccggat 
agatcaccac 
agaaggaaac 
gggagttcgt 
tagtgggcgc 
aagccggata 
accagaagac 
acatcgtgac 
agtccgagct 
cctgggtacc 
ctggcatcag 
aataccacag 
aagagatcgt 
tggactgtag 
tggtagccgt 



cgctgtactg 
tcgaagagga 
acagcaacca 
accaggccat 
ttagcccgga 
tgaacaccat 
ccatcaatga 
caccgggcca 
aggaacagat 
gctggatcat 
acatccgcca 
tccgcgccga 
agaacgcgaa 
aggaaatgat 
ctgaggccat 
ggaaccaacg 
actgcagggc 
aagactgtac 
ggccagggaa 
ggtctggggt 
atcctttaac 
ggggcagctc 
gtcgttgcca 
gcgccagtat 
ggtgggcccc 
gctgaacttc 
cggcccgaag 
ttgcacagag 
cacgccggtg 
ccgcgagctg 
cgcagggctg 
cgttcccctg 
cgagacaccg 
cgcaatcttc 
catcgtcatc 
gcaccgcacc 
cgacaagaag 
caaatggacc 
acagaagctg 
gcagctgtgc 
ggaggccgag 
gtactatgac 
gacctatcag 
gaggggtgcc 
cgaaagcatc 
ctgggaaacc 
caacacccct 
cgaaaccttc 
cgtcactaac 
tgagctgcag 
agactctcag 
ggtcaatcag 
cgcccacaaa 
gaaggtgcta 
caactggcgg 
ggccagctgt 
ccccggcatc 
ccatgtggcc 



cgtccaccag 
acagaataag 
ggtcagccag 
ctccccccgc 
ggtgataccc 
gctcaacaca 
ggaggctgcc 
gatgcgtgag 
cggctggatg 
cctgggcctg 
aggcccgaag 
gcaggctagc 
cccggactgc 
gaccgcctgt 
gagccaggtg 
caagatcgtc 
ccctaggaaa 
tgagagacag 
ttttcttcag 
agagacaaca 
ttccctcaga 
aaggaggctc 
ggccgctgga 
gaccagatcc 
acacccgtca 
cccattagcc 
gtcaagcaat 
atggaaaagg 
ttcgcaatca 
aacaagcgca 
aagaagaaga 
gacgaagact 

gggattcgat 

cagagtagca 
tatcagtaca 
aagatcgagg 
caccagaagg 
gtgcagccta 
gtggggaagt 
aaactcctcc 
ctcgaactgg 
ccctccaagg 
atttaccagg 
cacactaacg 
gtgatctggg 
tggtggacag 
cccctggtga 
tacgtggatg 
cggggcagac 
gccatttacc 
tatgccctgg 
atcatcgagc 
ggcattggcg 
ttcctggatg 
gccatggcta 
gacaagtgtc 
tggcaactcg 
agtggctaca 



cgcatcgaaa 
agcaaaaaga 
aactacccca 
acgctgaacg 
atgttctcag 

gtggggggac 

gaatgggatc 
ccacggggct 
accaacaacc 
aacaagatcg 
gaaccctttc 
caggaggtga 
aagacgatcc 
cagggagtgg 
accaactccg 
aagtgcttca 
aagggctgct 
gctaattttt 
agcagaccag 
actccccctc 
tcactctttg 
tcctggacac 
agccgaagat 
tcatcgaaat 
acatcatcgg 
ctatcgagac 
ggccattgac 
aagggaaaat 
agaagaagga 
cgcaagactt 
aatccgtgac 
tcaggaagta 
atcagtacaa 
tgaccaaaat 
tggatgactt 
agctgcgcca 
agcctccctt 
tcgtgctgcc 
tgaactgggc 
gcggaaccaa 
cagaaaaccg 
acctgatcgc 
agcccttcaa 
acgtcaagca 
gaaagactcc 
agtattggca 
agctgtggta 
gggccgctaa 
agaaggttgt 
tcgctttgca 
gcatcattca 
agctgatcaa 
gcaatgagca 
gcatcgacaa 
gcgacttcaa 
agctcaaggg 
attgcaccca 
tcgaggccga 



tcaaggatac 
aggcccaaca 
tcgtgcagaa 
cctgggtgaa 
ccctgtcaga 
accaggccgc 
gtgtgcatcc 
cagacatcgc 
cacccatccc 
tgcgcatgta 
gcgactacgt 
agaactggat 
tgaaggccct 
gcggacccgg 
ctaccatcat 
actgtggcaa 
ggaaatgcgg 
tagggaagat 
agccaacagc 
agaagcagga 
gcaacgaccc 
cggagcagac 
gatcggggga 
ctgcggccac 
acgcaacctg 
ggtaccggtg 
agaggagaag 
ctccaagatt 
ctcgacgaaa 
ctgggaggtt 
cgtactggat 
cactgccttc 
cgtgctgccc 
cctggagcct 
gtacgtgggc 
gcacctgttg 
cctctggatg 
agagaaagac 
cagtcagatt 
ggcactcaca 
agagatccta 
cgagatccag 
gaacctgaag 
gctgaccgag 
taagttcaag 
ggccacctgg 
ccagctggag 
cagggagact 
caccctcact 
ggactcgggc 
agcccagcca 
gaaggaaaag 
ggtcgacaag 
ggcccaggac 
cctgccccct 
cgaagccatg 
tctggagggc 
ggtcattccc 



gaaagaggcc 
ggccgccgcg 
catccagggg 
ggtggtggaa 
gggagccacc 
catgcagatg 
ggtgcacgca 
cggaacgact 
ggtgggagaa 
tagccctacc 
ggaccggttc 
gaccgaaacc 
gggcccagcg 
ccacaaggca 
gatgcagcgc 
agaagggcac 
caaggaaggc 
ctggccttcc 
cccaccagaa 
gccgatagac 
ctcgtcacaa 
gacaccgtgc 
atcggcggtt 
aaggctatcg 
ttgacgcaga 
aagctgaagc 
atcaaggcac 
gggcctgaga 
tggcgcaagc 
cagctgggca 
gtgggtgatg 
acaatccctt 
cagggctgga 
ttccgcaaac 
tctgatctag 
aggtggggac 
ggttacgagc 
agctggactg 
tacccaggga 
gaggtgatcc 
aaggagcccg 
aagcaggggc 
accggcaagt 
gccgtgcaga 
ctgcccatcc 
attcctgagt 
aaggagccca 
aagctgggca 
gacaccacca 
ctggaggtga 
gaccagagtg 
gtctatctgg 
ctggtctcgg 
gagcacgaga 
gtggtggcca 
catggccagg 
aaggttatcc 
gccgaaacag 



300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2280 

2340 

2400 

2460 

2520 

2580 

2640 

2700 

2760 

2820 

2880 

2940 

3000 

3060 

3120 

3180 

3240 

3300 

3360 

3420 

3480 

3540 

3600 

3660 

3720 






ggcaggagac agcctacttc ctcctgaagc tggcaggccg gtggccagtg aagaccatcc 3780
atactgacaa tggcagcaat ttcaccagtg ctacggttaa ggccgcctgc tggtgggcgg 3840
gaatcaagca ggagttcggg atcccctaca atccccagag tcagggcgtc gtcgagtcta 3900
tgaataagga gttaaagaag attatcggcc aggtcagaga tcaggctgag catctcaaga 3960
ccgcggtcca aatggcggta ttcatccaca atttcaagcg gaaggggggg attggggggt 4020
acagtgcggg ggagcggatc gtggacatca tcgcgaccga catccagact aaggagctgc 4080
aaaagcagat taccaagatt cagaatttcc gggtctacta cagggacagc agaaatcccc 4140
tctggaaagg cccagcgaag ctcctctgga agggtgaggg ggcagtagtg atccaggata 4200
atagcgacat caaggtggtg cccagaagaa aggcgaagat cattagggat tatggcaaac 4260
agatggcggg tgatgattgc gtggcgagca gacaggatga ggattag 4307



<210> 3 
<211> 2571 
<212> DNA

<213> Human immunodeficiency virus type 1 



<400> 3 

atgagagtga aggggatcag gaggaattat cagcactggt ggggatgggg cacgatgctc 60
cttgggttat taatgatctg tagtgctaca gaaaaattgt gggtcacagt ctattatggg 120
gtacctgtgt ggaaagaagc aaccaccact ctattttgtg catcagatgc taaagcatat 180
gatacagagg tacataatgt ttgggccaca caagcctgtg tacccacaga ccccaaccca 240
caagaagtag aattggtaaa tgtgacagaa aattttaaca tgtggaaaaa taacatggta 300
gaacagatgc atgaggatat aatcagttta tgggatcaaa gcctaaagcc atgtgtaaaa 360
ttaaccccac tctgtgttac tttaaattgc actgatttga ggaatactac taataccaat 420
aatagtactg ctaataacaa tagtaatagc gagggaacaa taaagggagg agaaatgaaa 480
aactgctctt tcaatatcac cacaagcata agagataaga tgcagaaaga atatgcactt 540
ctttataaac ttgatatagt atcaatagat aatgatagta ccagctatag gttgataagt 600
tgtaatacct cagtcattac acaagcttgt ccaaagatat cctttgagcc aattcccata 660
cactattgtg ccccggctgg ttttgcgatt ctaaaatgta acgataaaaa gttcagtgga 720
aaaggatcat gtaaaaatgt cagcacagta caatgtacac atggaattag gccagtagta 780
tcaactcaac tgctgttaaa tggcagtcta gcagaagaag aggtagtaat tagatctgag 840
aatttcactg ataatgctaa aaccatcata gtacatctga atgaatctgt acaaattaat 900
tgtacaagac ccaactacaa taaaagaaaa aggatacata taggaccagg gagagcattt 960
tatacaacaa aaaatataat aggaactata agacaagcac attgtaacat tagtagagca 1020
aaatggaatg acactttaag acagatagtt agcaaattaa aagaacaatt taagaataaa 1080
acaatagtct ttaatcaatc ctcaggaggg gacccagaaa ttgtaatgca cagttttaat 1140
tgtggagggg aatttttcta ctgtaataca tcaccactgt ttaatagtac ttggaatggt 1200
aataatactt ggaataatac tacagggtca aataacaata tcacacttca atgcaaaata 1260
aaacaaatta taaacatgtg gcaggaagta ggaaaagcaa tgtatgcccc tcccattgaa 1320
ggacaaatta gatgttcatc aaatattaca gggctactat taacaagaga tggtggtaag 1380
gacacggaca cgaacgacac cgagatcttc agacctggag gaggagatat gagggacaat 1440
tggagaagtg aattatataa atataaagta gtaacaattg aaccattagg agtagcaccc 1500
accaaggcaa agagaagagt ggtgcagaga gaaaaaagag cagcgatagg agctctgttc 1560
cttgggttct taggagcagc aggaagcact atgggcgcag cgtcagtgac gctgacggta 1620
caggccagac tattattgtc tggtatagtg caacagcaga acaatttgct gagggccatt 1680
gaggcgcaac agcatatgtt gcaactcaca gtctggggca tcaagcagct ccaggcaaga 1740
gtcctggctg tggaaagata cctaaaggat caacagctcc tggggttttg gggttgctct 1800
ggaaaactca tttgcaccac tactgtgcct tggaatgcta gttggagtaa taaatctctg 1860
gatgatattt ggaataacat gacctggatg cagtgggaaa gagaaattga caattacaca 1920
agcttaatat actcattact agaaaaatcg caaacccaac aagaaaagaa tgaacaagaa 1980
ttattggaat tggataaatg ggcaagtttg tggaattggt ttgacataac aaattggctg 2040
tggtatataa aaatattcat aatgatagta ggaggcttgg taggtttaag aatagttttt 2100
gctgtacttt ctatagtgaa tagagttagg cagggatact caccattgtc gttgcagacc 2160
cgccccccag ttccgagggg acccgacagg cccgaaggaa tcgaagaaga aggtggagag 2220
agagacagag acacatccgg tcgattagtg catggattct tagcaattat ctgggtcgac 2280
ctgcggagcc tgttcctctt cagctaccac cacagagact tactcttgat tgcagcgagg 2340
attgtggaac ttctgggacg cagggggtgg gaagtcctca aatattggtg gaatctccta 2400
cagtattgga gtcaggaact aaagagtagt gctgttagct tgcttaatgc cacagctata 2460
gcagtagctg aggggacaga tagggttata gaagtactgc aaagagctgg tagagctatt 2520
ctccacatac ctacaagaat aagacagggc ttggaaaggg ctttgctata a 2571



<210> 4 
<211> 2571 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 

SYNgp-160mn-codon optimised env sequence 

<400> 4 

atgagggtga aggggatccg ccgcaactac cagcactggt ggggctgggg cacgatgctc 60
ctggggctgc tgatgatctg cagcgccacc gagaagctgt gggtgaccgt gtactacggc 120
gtgcccgtgt ggaaggaggc caccaccacc ctgttctgcg ccagcgacgc caaggcgtac 180
gacaccgagg tgcacaacgt gtgggccacc caggcgtgcg tgcccaccga ccccaacccc 240
caggaggtgg agctcgtgaa cgtgaccgag aacttcaaca tgtggaagaa caacatggtg 300
gagcagatgc atgaggacat catcagcctg tgggaccaga gcctgaagcc ctgcgtgaag 360
ctgacccccc tgtgcgtgac cctgaactgc accgacctga ggaacaccac caacaccaac 420
aacagcaccg ccaacaacaa cagcaacagc gagggcacca tcaagggcgg cgagatgaag 480
aactgcagct tcaacatcac caccagcatc cgcgacaaga tgcagaagga gtacgccctg 540
ctgtacaagc tggatatcgt gagcatcgac aacgacagca ccagctaccg cctgatctcc 600
tgcaacacca gcgtgatcac ccaggcctgc cccaagatca gcttcgagcc catccccatc 660
cactactgcg cccccgccgg cttcgccatc ctgaagtgca acgacaagaa gttcagcggc 720
aagggcagct gcaagaacgt gagcaccgtg cagtgcaccc acggcatccg gccggtggtg 780
agcacccagc tcctgctgaa cggcagcctg gccgaggagg aggtggtgat ccgcagcgag 840
aacttcaccg acaacgccaa gaccatcatc gtgcacctga atgagagcgt gcagatcaac 900
tgcacgcgtc ccaactacaa caagcgcaag cgcatccaca tcggccccgg gcgcgccttc 960
tacaccacca agaacatcat cggcaccatc cgccaggccc actgcaacat ctctagagcc 1020
aagtggaacg acaccctgcg ccagatcgtg agcaagctga aggagcagtt caagaacaag 1080
accatcgtgt tcaaccagag cagcggcggc gaccccgaga tcgtgatgca cagcttcaac 1140
tgcggcggcg aattcttcta ctgcaacacc agccccctgt tcaacagcac ctggaacggc 1200
aacaacacct ggaacaacac caccggcagc aacaacaata ttaccctcca gtgcaagatc 1260
aagcagatca tcaacatgtg gcaggaggtg ggcaaggcca tgtacgcccc ccccatcgag 1320
ggccagatcc ggtgcagcag caacatcacc ggtctgctgc tgacccgcga cggcggcaag 1380
gacaccgaca ccaacgacac cgaaatcttc cgccccggcg gcggcgacat gcgcgacaac 1440
tggagatctg agctgtacaa gtacaaggtg gtgacgatcg agcccctggg cgtggccccc 1500
accaaggcca agcgccgcgt ggtgcagcgc gagaagcggg ccgccatcgg cgccctgttc 1560
ctgggcttcc tgggggcggc gggcagcacc atgggggccg ccagcgtgac cctgaccgtg 1620
caggcccgcc tgctcctgag cggcatcgtg cagcagcaga acaacctcct ccgcgccatc 1680
gaggcccagc agcatatgct ccagctcacc gtgtggggca tcaagcagct ccaggcccgc 1740
gtgctggccg tggagcgcta cctgaaggac cagcagctcc tgggcttctg gggctgctcc 1800
ggcaagctga tctgcaccac cacggtaccc tggaacgcct cctggagcaa caagagcctg 1860
gacgacatct ggaacaacat gacctggatg cagtgggagc gcgagatcga taactacacc 1920
agcctgatct acagcctgct ggagaagagc cagacccagc aggagaagaa cgagcaggag 1980
ctgctggagc tggacaagtg ggcgagcctg tggaactggt tcgacatcac caactggctg 2040
tggtacatca aaatcttcat catgattgtg ggcggcctgg tgggcctccg catcgtgttc 2100
gccgtgctga gcatcgtgaa ccgcgtgcgc cagggctaca gccccctgag cctccagacc 2160
cggccccccg tgccgcgcgg gcccgaccgc cccgagggca tcgaggagga gggcggcgag 2220
cgcgaccgcg acaccagcgg caggctcgtg cacggcttcc tggcgatcat ctgggtcgac 2280
ctccgcagcc tgttcctgtt cagctaccac caccgcgacc tgctgctgat cgccgcccgc 2340
atcgtggaac tcctaggccg ccgcggctgg gaggtgctga agtactggtg gaacctcctc 2400
cagtattgga gccaggagct gaagtccagc gccgtgagcc tgctgaacgc caccgccatc 2460
gccgtggccg agggcaccga ccgcgtgatc gaggtgctcc agagggccgg gagggcgatc 2520
ctgcacatcc ccacccgcat ccgccagggg ctcgagaggg cgctgctgta a 2571






<210> 5 
<211> 116 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 5 

tcgagcccgg ggatgacgtc atcgacttcg aaggttcgaa tccttctact gccaccattt 60 
tttctctacg tcatcgactt cgaaggttcg aatccttccc tgtccaccag tcgacc 116 



<210> 6 
<211> 110 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 6 

tcgagtatta cgtcatcgac ttcgaaggtt cgaatccttc tagattcacc attttttagg 60 
aacgtcatcg acttcgaagg ttcgaatcct tccagttcca ccagtcgacc 110 



<210> 7 
<211> 110 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 7 

tcgaggccaa cgtcatcgac ttcgaaggtt cgaatccttc tcttcccacc attttttttc 60 
cacgtcatcg acttcgaagg ttcgaatcct tcggggccca ccagtcgacc 110 



<210> 8 
<211> 110 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 8 

tcgagggcta cgtcatcgac ttcgaaggtt cgaatccttc ttgcttcacc attttttctg 60 
aacgtcatcg acttcgaagg ttcgaatcct tctgctgtca ccagtcgacc 110 



<210> 9 

<211> 110 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 9 

tcgagtataa cgtcatcgac ttcgaaggtt cgaatccttc accggtcacc atttttttat 60 
aacgtcatcg acttcgaagg ttcgaatcct tcttcttaca ccagtcgacc 110 



<210> 10 
<211> 116 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 10 

tcgaggtaca cgtcatcgac ttcgaaggtt cgaatccttc gtagttcacc attttttgtg 60 
cacgtcatcg acttcgaagg ttcgaatcct tctaggccca ccagtcgacg catgcc 116 



<210> 11 
<211> 8560 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
nucleotide pH4DOZENEGS sequence 



<400> 11 

ctgacgcgcc ctgtagcggc gcattaagcg cggcgggtgt ggtggttacg cgcagcgtga 60
ccgctacact tgccagcgcc ctagcgcccg ctcctttcgc tttcttccct tcctttctcg 120
ccacgttcgc cggctttccc cgtcaagctc taaatcgggg gctcccttta gggttccgat 180
ttagtgcttt acggcacctc gaccccaaaa aacttgatta gggtgatggt tcacgtagtg 240
ggccatcgcc ctgatagacg gtttttcgcc ctttgacgtt ggagtccacg ttctttaata 300
gtggactctt gttccaaact ggaacaacac tcaaccctat ctcggtctat tcttttgatt 360
tataagggat tttgccgatt tcggcctatt ggttaaaaaa tgagctgatt taacaaaaat 420
ttaacgcgaa ttttaacaaa atattaacgc ttacaatttc cattcgccat tcaggctgcg 480
caactgttgg gaagggcgat cggtgcgggc ctcttcgcta ttacgccagc tggcgaaagg 540
gggatgtgct gcaaggcgat taagttgggt aacgccaggg ttttcccagt cacgacgttg 600
taaaacgacg gccagtgagc gcgcgtaata cgactcacta tagggcgaat tggagctcca 660
ccgcggtggc ggccgctcta gagtccgtta cataacttac ggtaaatggc ccgcctggct 720
gaccgcccaa cgacccccgc ccattgacgt caataatgac gtatgttccc atagtaacgc 780
caatagggac tttccattga cgtcaatggg tggagtattt acggtaaact gcccacttgg 840
cagtacatca agtgtatcat atgccaagta cgccccctat tgacgtcaat gacggtaaat 900
ggcccgcctg gcattatgcc cagtacatga ccttatggga ctttcctact tggcagtaca 960
tctacgtatt agtcatcgct attaccatgg tgatgcggtt ttggcagtac atcaatgggc 1020
gtggatagcg gtttgactca cggggatttc caagtctcca ccccattgac gtcaatggga 1080
gtttgttttg gcaccaaaat caacgggact ttccaaaatg tcgtaacaac tccgccccat 1140
tgacgcaaat gggcggtagg cgtgtacggt gggaggtcta tataagcaga gctcgtttag 1200



tgaaccggtc tctctggtta gaccagatct gagcctggga gctctctggc taactaggga 1260
acccactgct taagcctcaa taaagcttgc cttgagtgct tcaagtagtg tgtgcccgtc 1320
tgttgtgtga ctctggtaac tagagatccc tcagaccctt ttagtcagtg tggaaaatct 1380
ctagcagtgg cgcccgaaca gggacttgaa agcgaaaggg aaaccagagg agctctctcg 1440
acgcaggact cggcttgctg aagcgcgcac ggcaagaggc gaggggcggc gactggtgag 1500
tacgccaaaa attttgacta gcggaggcta gaaggagaga gatgggtgcg agagcgtcag 1560
tattaagcgg gggagaatta gatcgcgatg ggaaaaaatt cggttaaggc cagggggaaa 1620
gaaaaaatat aaattaaaac atatagtatg ggcaagcagg gagctagaac gattcgcagt 1680
taatcctggc ctgttagaaa catcagaagg ctgtagacaa atactgggac agctacaacc 1740
atcccttcag acaggatcag aagaacttag atcattatat aatacagtag caaccctcta 1800
ttgtgtgcat caaaggttga gataaaagac accaaggaag ctttagacaa gatagaggga 1860
gagcaaaaca aaagtaagaa aaaagcacag caagcagcag ctgacacagg acacagcaat 1920
caggtcagcc aaaattaccc tatagtgcag aacatccagg ggcaaatggt acatcaggcc 1980
atatcaccta gaactttaaa tgcatgggta aaagtagtag aagagaaggc tttcagccca 2040
gaagtgatac ccatgttttc agcattatca gaaggagcca ccccacaaga tttaaacacc 2100
atgctaaaca cagtgggggg acatcaagca gccatgcaaa tgttaaaaga gaccatcaat 2160
gaggaagctg caggaattcg cctaaaactg cttgtaccaa ttgctattgt aaaaagtgtt 2220
gctttcattg ccaagtttgt ttcataacaa aagccttagg catctcctat ggcaggaaga 2280
agcggagaca gcgacgaaga gctcatcaga acagtcagac tcatcaagct tctctatcaa 2340
agcagtaagt agtacatgta acgcaaccta taccaatagt agcaatagta gcattagtag 2400
tagcaataat aatagcaata gttgtgtggt ccatagtaat catagaatat aggaaaatat 2460
taagacaaag aaaaatagac aggttaattg atagactaat agaaagagca gaagacagtg 2520
gcaatgagag tgaaggagaa atatcagcac ttgtggagat gggggtggag atggggcacc 2580
atgctccttg ggatgttgat gatctgtagt gctacagaaa aattgtgggt cacagtctat 2640
tatggggtac ctgtgtggaa ggaagcaacc accactctat tttgtgcatc agatgctaaa 2700
gcatagatct tcagacttgg aggaggagat atgagggaca attggagaag tgaattatat 2760
aaatataaag tagtaaaaat tgaaccatta ggagtagcac ccaccaaggc aaagagaaga 2820
gtggtgcaga gagaaaaaag agcagtggga ataggagctt tgttccttgg gttcttggga 2880
gcagcaggaa gcactatggg cgcagcgtca atgacgctga cggtacaggc cagacaatta 2940
ttgtctggta tagtgcagca gcagaacaat ttgctgaggg ctattgaggc gcaacagcat 3000
ctgttgcaac tcacagtctg gggcatcaag cagctccagg caagaatcct ggctgtggaa 3060
agatacctaa aggatcaaca gctcctgggg atttggggtt gctctggaaa actcatttgc 3120
accactgctg tgccttggaa tgctagttgg agtaataaat ctctggaaca gatctggaat 3180
cacacgacct ggatggagtg ggacagagaa attaacaatt acacaagctt aatacactcc 3240
ttaattgaag aatcgcaaaa ccagcaagaa aagaatgaac aagaattatt ggaattagat 3300
aaatgggcaa gtttgtggaa ttggtttaac ataacaaatt ggctgtggta tataaaatta 3360
ttcataatga tagtaggagg cttggtaggt ttaagaatag tttttgctgt actttctata 3420
gtgaatagag ttaggcaggg atattcacca ttatcgtttc agacccacct cccaaccccg 3480
aggggacccg acaggcccga aggaatagaa gaagaaggtg gagagagaga cagagacaga 3540
tccattcgat tagtgaacgg atccttggca cttatctggg acgatctgcg gagcctgtgc 3600
ctcttcagct accaccgctt gagagactta ctcttgattg taacgaggat tgtggaactt 3660
ctgggacgca gggggtggga agccctcaaa tattggtgga atctcctaca gtattggagt 3720
caggaactaa agaatagtgc tgttagcttg ctcaatgcca cagccatagc agtagctgag 3780
gggacagata gggttataga agtagtacaa ggagcttgta gagctattcg ccacatacct 3840
agaagaataa gacagggctt ggaaaggatt ttgctataag atgggtggca agtggtcaaa 3900
aagtagtgtg attggatggc ctactgtaag ggaaagaatg agacgagctg agccagcagc 3960
agatagggtg ggagcagcat ctcgacgctg caggagtggg gaggcacgat ggccgctttg 4020
gtcgaggcgg atccggccat tagccatatt attcattggt tatatagcat aaatcaatat 4080
tggctattgg ccattgcata cgttgtatcc atatcataat atgtacattt atattggctc 4140
atgtccaaca ttaccgccat gttgacattg attattgact agttattaat agtaatcaat 4200
tacggggtca ttagttcata gcccatatat ggagttccgc gttacataac ttacggtaaa 4260
tggcccgcct ggctgaccgc ccaacgaccc ccgcccattg acgtcaataa tgacgtatgt 4320
tcccatagta acgccaatag ggactttcca ttgacgtcaa tgggtggagt atttacggta 4380
aactgcccac ttggcagtac atcaagtgta tcatatgcca agtacgcccc ctattgacgt 4440
caatgacggt aaatggcccg cctggcatta tgcccagtac atgaccttat gggactttcc 4500
tacttggcag tacatctacg tattagtcat cgctattacc atggtgatgc ggttttggca 4560
gtacatcaat gggcgtggat agcggtttga ctcacgggga tttccaagtc tccaccccat 4620
tgacgtcaat gggagtttgt tttggcacca aaatcaacgg gactttccaa aatgtcgtaa 4680



caactccgcc ccattgacgc aaatgggcgg taggcatgta cggtgggagg tctatataag 4740
cagagctcgt ttagtgaacc gtcagatcgc ctggagacgc catccacgct gttttgacct 4800
ccatagaaga caccgggacc gatccagcct ccgcggcccc aagcttcagc tgctcgagcc 4860
cggggatgac gtcatcgact tcgaaggttc gaatccttct actgccacca ttttttctct 4920
acgtcatcga cttcgaaggt tcgaatcctt ccctgtccac cagtcgagta ttacgtcatc 4980
gacttcgaag gttcgaatcc ttctagattc accatttttt aggaacgtca tcgacttcga 5040
aggttcgaat ccttccagtt ccaccagtcg aggccaacgt catcgacttc gaaggttcga 5100
atccttctct tcccaccatt ttttttccac gtcatcgact tcgaaggttc gaatccttcg 5160
gggcccacca gtcgagggct acgtcatcga cttcgaaggt tcgaatcctt cttgcttcac 5220
cattttttct gaacgtcatc gacttcgaag gttcgaatcc ttctgctgtc accagtcgag 5280
tataacgtca tcgacttcga aggttcgaat ccttcaccgg tcaccatttt tttataacgt 5340
catcgacttc gaaggttcga atccttcttc ttacaccagt cgaggtacac gtcatcgact 5400
tcgaaggttc gaatccttcg tagttcacca ttttttgtgc acgtcatcga cttcgaaggt 5460
tcgaatcctt ctaggcccac cagtcgacgc atgcctgcag gtcgaggtcg ataccgtcga 5520
gacctagaaa aacatggagc aatcacaagt agcaatacag cagctaccaa tgctgattgt 5580
gcctggctag aagcacaaga ggaggaggag gtgggttttc cagtcacacc tcaggtacct 5640
ttaagaccaa tgacttacaa ggcagctgta gatcttagcc actttttaaa agaaaagggg 5700
ggactggaag ggctaattca ctcccaacga agacaagata tccttgatct gtggatctac 5760
cacacacaag gctacttccc tgattggcag aactacacac cagggccagg gatcagatat 5820
ccactgacct ttggatggtg ctacaagcta gtaccagttg agcaagagaa ggtagaagaa 5880
gccaatgaag gagagaacac ccgcttgtta caccctgtga gcctgcatgg gatggatgac 5940
ccggagagag aagtattaga gtggaggttt gacagccgcc tagcatttca tcacatggcc 6000
cgagagctgc atccggagta cttcaagaac tgctgacatc gagcttgcta caagggactt 6060
tccgctgggg actttccagg gaggcgtggc ctgggcggga ctggggagtg gcgagccctc 6120
agatgctgca tataagcagc tgctttttgc ctgtactggg tctctctggt tagaccagat 6180
ctgagcctgg gagctctctg gctaactagg gaacccactg cttaagcctc aataaagctt 6240
gccttgagtg cttcaagtag tgtgtgcccg tctgttgtgt gactctggta actagagatc 6300
cctcagaccc ttttagtcag tgtggaaaat ctctagcagt cgaggggggg cccggtaccc 6360
agcttttgtt ccctttagtg agggttaatt gcgcgcttgg cgtaatcatg gtcatagctg 6420
tttcctgtgt gaaattgtta tccgctcaca attccacaca acatacgagc cggaagcata 6480
aagtgtaaag cctggggtgc ctaatgagtg agctaactca cattaattgc gttgcgctca 6540
ctgcccgctt tccagtcggg aaacctgtcg tgccagctgc attaatgaat cggccaacgc 6600
gcggggagag gcggtttgcg tattgggcgc tcttccgctt cctcgctcac tgactcgctg 6660
cgctcggtcg ttcggctgcg gcgagcggta tcagctcact caaaggcggt aatacggtta 6720
tccacagaat caggggataa cgcaggaaag aacatgtgag caaaaggcca gcaaaaggcc 6780
aggaaccgta aaaaggccgc gttgctggcg tttttccata ggctccgccc ccctgacgag 6840
catcacaaaa atcgacgctc aagtcagagg tggcgaaacc cgacaggact ataaagatac 6900
caggcgtttc cccctggaag ctccctcgtg cgctctcctg ttccgaccct gccgcttacc 6960
ggatacctgt ccgcctttct cccttcggga agcgtggcgc tttctcatag ctcacgctgt 7020
aggtatctca gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc 7080
gttcagcccg accgctgcgc cttatccggt aactatcgtc ttgagtccaa cccggtaaga 7140
cacgacttat cgccactggc agcagccact ggtaacagga ttagcagagc gaggtatgta 7200
ggcggtgcta cagagttctt gaagtggtgg cctaactacg gctacactag aaggacagta 7260
tttggtatct gcgctctgct gaagccagtt accttcggaa aaagagttgg tagctcttga 7320
tccggcaaac aaaccaccgc tggtagcggt ggtttttttg tttgcaagca gcagattacg 7380
cgcagaaaaa aaggatctca agaagatcct ttgatctttt ctacggggtc tgacgctcag 7440
tggaacgaaa actcacgtta agggattttg gtcatgagat tatcaaaaag gatcttcacc 7500
tagatccttt taaattaaaa atgaagtttt aaatcaatct aaagtatata tgagtaaact 7560
tggtctgaca gttaccaatg cttaatcagt gaggcaccta tctcagcgat ctgtctattt 7620
cgttcatcca tagttgcctg actccccgtc gtgtagataa ctacgatacg ggagggctta 7680
ccatctggcc ccagtgctgc aatgataccg cgagacccac gctcaccggc tccagattta 7740
tcagcaataa accagccagc cggaagggcc gagcgcagaa gtggtcctgc aactttatcc 7800
gcctccatcc agtctattaa ttgttgccgg gaagctagag taagtagttc gccagttaat 7860
agtttgcgca acgttgttgc cattgctaca ggcatcgtgg tgtcacgctc gtcgtttggt 7920
atggcttcat tcagctccgg ttcccaacga tcaaggcgag ttacatgatc ccccatgttg 7980
tgcaaaaaag cggttagctc cttcggtcct ccgatcgttg tcagaagtaa gttggccgca 8040
gtgttatcac tcatggttat ggcagcactg cataattctc ttactgtcat gccatccgta 8100
agatgctttt ctgtgactgg tgagtactca accaagtcat tctgagaata gtgtatgcgg 8160
cgaccgagtt gctcttgccc ggcgtcaata cgggataata ccgcgccaca tagcagaact 8220
ttaaaagtgc tcatcattgg aaaacgttct tcggggcgaa aactctcaag gatcttaccg 8280
ctgttgagat ccagttcgat gtaacccact cgtgcaccca actgatcttc agcatctttt 8340
actttcacca gcgtttctgg gtgagcaaaa acaggaaggc aaaatgccgc aaaaaaggga 8400
ataagggcga cacggaaatg ttgaatactc atactcttcc tttttcaata ttattgaagc 8460
atttatcagg gttattgtct catgagcgga tacatatttg aatgtattta gaaaaataaa 8520
caaatagggg ttccgcgcac atttccccga aaagtgccac 8560



<210> 12 
<211> 4642 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: pSYNGP2 -codon 
optimised HIV-1 gagpol with leader sequence 

<400> 12 

gggtctctct ggttagacca gatctgagcc tgggagctct ctggctaact agggaaccca 60
ctgcttaagc ctcaataaag cttgccttga gtgcttcaag tagtgtgtgc ccgtctgttg 120
tgtgactctg gtaactagag atccctcaga cccttttagt cagtgtggaa aatctctagc 180
agtggcgccc gaacagggac ctgaaagcga aagggaaacc agagctctct cgacgcagga 240
ctcggcttgc tgaagcgccc gcacggcaag aggcgagggg cggcgactgg tgagtacgcc 300
aaaaattttg actagcggag gctagaagga gagagatggg cgcccgcgcc agcgtgctgt 360
cgggcggcga gctggaccgc tgggagaaga tccgcctgcg ccccggcggc aaaaagaagt 420
acaagctgaa gcacatcgtg tgggccagcc gcgaactgga gcgcttcgcc gtgaaccccg 480
ggctcctgga gaccagcgag gggtgccgcc agatcctcgg ccaactgcag cccagcctgc 540
aaaccggcag cgaggagctg cgcagcctgt acaacaccgt ggccacgctg tactgcgtcc 600
accagcgcat cgaaatcaag gatacgaaag aggccctgga taaaatcgaa gaggaacaga 660
ataagagcaa aaagaaggcc caacaggccg ccgcggacac cggacacagc aaccaggtca 720
gccagaacta ccccatcgtg cagaacatcc aggggcagat ggtgcaccag gccatctccc 780
cccgcacgct gaacgcctgg gtgaaggtgg tggaagagaa ggcttttagc ccggaggtga 840
tacccatgtt ctcagccctg tcagagggag ccacccccca agatctgaac accatgctca 900
acacagtggg gggacaccag gccgccatgc agatgctgaa ggagaccatc aatgaggagg 960
ctgccgaatg ggatcgtgtg catccggtgc acgcagggcc catcgcaccg ggccagatgc 1020
gtgagccacg gggctcagac atcgccggaa cgactagtac ccttcaggaa cagatcggct 1080
ggatgaccaa caacccaccc atcccggtgg gagaaatcta caaacgctgg atcatcctgg 1140
gcctgaacaa gatcgtgcgc atgtatagcc ctaccagcat cctggacatc cgccaaggcc 1200
cgaaggaacc ctttcgcgac tacgtggacc ggttctacaa aacgctccgc gccgagcagg 1260
ctagccagga ggtgaagaac tggatgaccg aaaccctgct ggtccagaac gcgaacccgg 1320
actgcaagac gatcctgaag gccctgggcc cagcggctac cctagaggaa atgatgaccg 1380
cctgtcaggg agtgggcgga cccggccaca aggcacgcgt cctggctgag gccatgagcc 1440
aggtgaccaa ctccgctacc atcatgatgc agcgcggcaa ctttcggaac caacgcaaga 1500
tcgtcaagtg cttcaactgt ggcaaagaag ggcacacagc ccgcaactgc agggccccta 1560
ggaaaaaggg ctgttggaaa tgtggaaagg aaggacacca aatgaaagat tgtactgaga 1620
gacaggctaa ttttttaggg aagatctggc cttcccacaa gggaaggcca gggaattttc 1680
ttcagagcag accagagcca acagccccac cagaagagag cttcaggttt ggggaagaga 1740
caacaactcc ctctcagaag caggagccga tagacaagga actgtatcct ttagcttccc 1800
tcagatcact ctttggcagc gacccctcgt cacaataaag ataggggggc agctcaagga 1860
ggctctcctg gacaccggag cagacgacac cgtgctggag gagatgtcgt tgccaggccg 1920
ctggaagccg aagatgatcg ggggaatcgg cggtttcatc aaggtgcgcc agtatgacca 1980
gatcctcatc gaaatctgcg gccacaaggc tatcggtacc gtgctggtgg gccccacacc 2040
cgtcaacatc atcggacgca acctgttgac gcagatcggt tgcacgctga acttccccat 2100
tagccctatc gagacggtac cggtgaagct gaagcccggg atggacggcc cgaaggtcaa 2160
gcaatggcca ttgacagagg agaagatcaa ggcactggtg gagatttgca cagagatgga 2220
aaaggaaggg aaaatctcca agattgggcc tgagaacccg tacaacacgc cggtgttcgc 2280
aatcaagaag aaggactcga cgaaatggcg caagctggtg gacttccgcg agctgaacaa 2340
gcgcacgcaa gacttctggg aggttcagct gggcatcccg caccccgcag ggctgaagaa 2400
gaagaaatcc gtgaccgtac tggatgtggg tgatgcctac ttctccgttc ccctggacga 2460
agacttcagg aagtacactg ccttcacaat cccttcgatc aacaacgaga caccggggat 2520
tcgatatcag tacaacgtgc tgccccaggg ctggaaaggc tctcccgcaa tcttccagag 2580
tagcatgacc aaaatcctgg agcctttccg caaacagaac cccgacatcg tcatctatca 2640
gtacatggat gacttgtacg tgggctctga tctagagata gggcagcacc gcaccaagat 2700
cgaggagctg cgccagcacc tgttgaggtg gggactgacc acacccgaca agaagcacca 2760
gaaggagcct cccttcctct ggatgggtta cgagctgcac cctgacaaat ggaccgtgca 2820
gcctatcgtg ctgccagaga aagacagctg gactgtcaac gacatacaga agctggtggg 2880
gaagttgaac tgggccagtc agatttaccc agggattaag gtgaggcagc tgtgcaaact 2940
cctccgcgga accaaggcac tcacagaggt gatcccccta accgaggagg ccgagctcga 3000
actggcagaa aaccgagaga tcctaaagga gcccgtgcac ggcgtgtact atgacccctc 3060
caaggacctg atcgccgaga tccagaagca ggggcaaggc cagtggacct atcagattta 3120
ccaggagccc ttcaagaacc tgaagaccgg caagtacgcc cggatgaggg gtgcccacac 3180
taacgacgtc aagcagctga ccgaggccgt gcagaagatc accaccgaaa gcatcgtgat 3240
ctggggaaag actcctaagt tcaagctgcc catccagaag gaaacctggg aaacctggtg 3300
gacagagtat tggcaggcca cctggattcc tgagtgggag ttcgtcaaca cccctcccct 3360
ggtgaagctg tggtaccagc tggagaagga gcccatagtg ggcgccgaaa ccttctacgt 3420
ggatggggcc gctaacaggg agactaagct gggcaaagcc ggatacgtca ctaaccgggg 3480
cagacagaag gttgtcaccc tcactgacac caccaaccag aagactgagc tgcaggccat 3540
ttacctcgct ttgcaggact cgggcctgga ggtgaacatc gtgacagact ctcagtatgc 3600
cctgggcatc attcaagccc agccagacca gagtgagtcc gagctggtca atcagatcat 3660
cgagcagctg atcaagaagg aaaaggtcta tctggcctgg gtacccgccc acaaaggcat 3720
tggcggcaat gagcaggtcg acaagctggt ctcggctggc atcaggaagg tgctattcct 3780
ggatggcatc gacaaggccc aggacgagca cgagaaatac cacagcaact ggcgggccat 3840
ggctagcgac ttcaacctgc cccctgtggt ggccaaagag atcgtggcca gctgtgacaa 3900
gtgtcagctc aagggcgaag ccatgcatgg ccaggtggac tgtagccccg gcatctggca 3960
actcgattgc acccatctgg agggcaaggt tatcctggta gccgtccatg tggccagtgg 4020
ctacatcgag gccgaggtca ttcccgccga aacagggcag gagacagcct acttcctcct 4080
gaagctggca ggccggtggc cagtgaagac catccatact gacaatggca gcaatttcac 4140
cagtgctacg gttaaggccg cctgctggtg ggcgggaatc aagcaggagt tcgggatccc 4200
ctacaatccc cagagtcagg gcgtcgtcga gtctatgaat aaggagttaa agaagattat 4260
cggccaggtc agagatcagg ctgagcatct caagaccgcg gtccaaatgg cggtattcat 4320
ccacaatttc aagcggaagg gggggattgg ggggtacagt gcgggggagc ggatcgtgga 4380
catcatcgcg accgacatcc agactaagga gctgcaaaag cagattacca agattcagaa 4440
tttccgggtc tactacaggg acagcagaaa tcccctctgg aaaggcccag cgaagctcct 4500
ctggaagggt gagggggcag tagtgatcca ggataatagc gacatcaagg tggtgcccag 4560
aagaaaggcg aagatcatta gggattatgg caaacagatg gcgggtgatg attgcgtggc 4620
gagcagacag gatgaggatt ag 4642



<210> 13 
<211> 4353 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: pSYNGP3 -codon 
optimised HIV-1 gagpol with leader sequence from 
the major splice donor 

<400> 13 

gtgagtacgc caaaaatttt gactagcgga ggctagaagg agagagatgg gcgcccgcgc 60 
cagcgtgctg tcgggcggcg agctggaccg ctgggagaag atccgcctgc gccccggcgg 12 0 
caaaaagaag tacaagctga agcacatcgt gtgggccagc cgcgaactgg agcgcttcgc 180 
cgtgaacccc gggctcctgg agaccagcga ggggtgccgc cagatcctcg gccaactgca 24 0 
gcccagcctg caaaccggca gcgaggagct gcgcagcctg tacaacaccg tggccacgct 300 
gtactgcgtc caccagcgca tcgaaatcaa ggatacgaaa gaggccctgg ataaaatcga 360 
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agaggaacag aataagagca aaaagaaggc 
caaccaggtc agccagaact accccatcgt 
ggccatctcc ccccgcacgc tgaacgcctg 
cccggaggtg atacccatgt tctcagccct 
caccatgctc aacacagtgg ggggacacca 
caatgaggag gctgccgaat gggatcgtgt 
gggccagatg cgtgagccac ggggctcaga 
acagatcggc tggatgacca acaacccacc 
gatcatcctg ggcctgaaca agatcgtgcg 
ccgccaaggc ccgaaggaac cctttcgcga 
cgccgagcag gctagccagg aggtgaagaa 
cgcgaacccg gactgcaaga cgatcctgaa 
aatgatgacc gcctgtcagg gagtgggcgg 
ggccatgagc caggtgacca actccgctac 
ccaacgcaag atcgtcaagt gcttcaactg 
cagggcccct aggaaaaagg gctgttggaa 
ttgtactgag agacaggcta attttttagg 
agggaatttt cttcagagca gaccagagcc 
tggggaagag acaacaactc cctctcagaa 
tttagcttcc ctcagatcac tctttggcag 
cagctcaagg aggctctcct ggacaccgga 
ttgccaggcc gctggaagcc gaagatgatc 
cagtatgacc agatcctcat cgaaatctgc 
ggccccacac ccgtcaacat catcggacgc 
aacttcccca ttagccctat cgagacggta 
ccgaaggtca agcaatggcc attgacagag 
acagagatgg aaaaggaagg gaaaatctcc 
ccggtgttcg caatcaagaa gaaggactcg 
gagctgaaca agcgcacgca agacttctgg 
gggctgaaga agaagaaatc cgtgaccgta 
cccctggacg aagacttcag gaagtacact 
acaccgggga ttcgatatca gtacaacgtg 
atcttccaga gtagcatgac caaaatcctg 
gtcatctatc agtacatgga tgacttgtac 
cgcaccaaga tcgaggagct gcgccagcac 
aagaagcacc agaaggagcc tcccttcctc 
tggaccgtgc agcctatcgt gctgccagag 
aagctggtgg ggaagttgaa ctgggccagt 
ctgtgcaaac tcctccgcgg aaccaaggca 
gccgagctcg aactggcaga aaaccgagag 
tatgacccct ccaaggacct gatcgccgag 
tatcagattt accaggagcc cttcaagaac 
ggtgcccaca ctaacgacgt caagcagctg 
agcatcgtga tctggggaaa gactcctaag 
gaaacctggt ggacagagta ttggcaggcc 
acccctcccc tggtgaagct gtggtaccag 
accttctacg tggatggggc cgctaacagg 
actaaccggg gcagacagaa ggttgtcacc 
ctgcaggcca tttacctcgc tttgcaggac 
tctcagtatg ccctgggcat cattcaagcc 
aatcagatca tcgagcagct gatcaagaag 
cacaaaggca ttggcggcaa tgagcaggtc 
gtgctattcc tggatggcat cgacaaggcc 
tggcgggcca tggctagcga cttcaacctg 
agctgtgaca agtgtcagct caagggcgaa 
ggcatctggc aactcgattg cacccatctg 
gtggccagtg gctacatcga ggccgaggtc 
tacttcctcc tgaagctggc aggccggtgg 



ccaacaggcc gccgcggaca ccggacacag 42 0 
gcagaacatc caggggcaga tggtgcacca 480 
ggtgaaggtg gtggaagaga aggcttttag 54 0 
gtcagaggga gccacccccc aagatctgaa 600 
ggccgccatg cagatgctga aggagaccat 660 
gcatccggtg cacgcagggc ccatcgcacc 72 0 
catcgccgga acgactagta cccttcagga 780
catcccggtg ggagaaatct acaaacgctg 840
catgtatagc cctaccagca tcctggacat 900
ctacgtggac cggttctaca aaacgctccg 960
ctggatgacc gaaaccctgc tggtccagaa 1020
ggccctgggc ccagcggcta ccctagagga 1080
acccggccac aaggcacgcg tcctggctga 1140
catcatgatg cagcgcggca actttcggaa 1200
tggcaaagaa gggcacacag cccgcaactg 1260
atgtggaaag gaaggacacc aaatgaaaga 1320
gaagatctgg ccttcccaca agggaaggcc 1380
aacagcccca ccagaagaga gcttcaggtt 1440
gcaggagccg atagacaagg aactgtatcc 1500
cgacccctcg tcacaataaa gatagggggg 1560
gcagacgaca ccgtgctgga ggagatgtcg 1620
gggggaatcg gcggtttcat caaggtgcgc 1680
ggccacaagg ctatcggtac cgtgctggtg 1740
aacctgttga cgcagatcgg ttgcacgctg 1800
ccggtgaagc tgaagcccgg gatggacggc 1860
gagaagatca aggcactggt ggagatttgc 1920
aagattgggc ctgagaaccc gtacaacacg 1980
acgaaatggc gcaagctggt ggacttccgc 2040
gaggttcagc tgggcatccc gcaccccgca 2100
ctggatgtgg gtgatgccta cttctccgtt 2160
gccttcacaa tcccttcgat caacaacgag 2220
ctgccccagg gctggaaagg ctctcccgca 2280
gagcctttcc gcaaacagaa ccccgacatc 2340
gtgggctctg atctagagat agggcagcac 2400
ctgttgaggt ggggactgac cacacccgac 2460
tggatgggtt acgagctgca ccctgacaaa 2520
aaagacagct ggactgtcaa cgacatacag 2580
cagatttacc cagggattaa ggtgaggcag 2640
ctcacagagg tgatccccct aaccgaggag 2700
atcctaaagg agcccgtgca cggcgtgtac 2760
atccagaagc aggggcaagg ccagtggacc 2820
ctgaagaccg gcaagtacgc ccggatgagg 2880
accgaggccg tgcagaagat caccaccgaa 2940
ttcaagctgc ccatccagaa ggaaacctgg 3000
acctggattc ctgagtggga gttcgtcaac 3060
ctggagaagg agcccatagt gggcgccgaa 3120
gagactaagc tgggcaaagc cggatacgtc 3180
ctcactgaca ccaccaacca gaagactgag 3240
tcgggcctgg aggtgaacat cgtgacagac 3300
cagccagacc agagtgagtc cgagctggtc 3360
gaaaaggtct atctggcctg ggtacccgcc 3420
gacaagctgg tctcggctgg catcaggaag 3480
caggacgagc acgagaaata ccacagcaac 3540
ccccctgtgg tggccaaaga gatcgtggcc 3600
gccatgcatg gccaggtgga ctgtagcccc 3660
gagggcaagg ttatcctggt agccgtccat 3720
attcccgccg aaacagggca ggagacagcc 3780
ccagtgaaga ccatccatac tgacaatggc 3840
agcaatttca ccagtgctac ggttaaggcc gcctgctggt gggcgggaat caagcaggag 3900
ttcgggatcc cctacaatcc ccagagtcag ggcgtcgtcg agtctatgaa taaggagtta 3960
aagaagatta tcggccaggt cagagatcag gctgagcatc tcaagaccgc ggtccaaatg 4020
gcggtattca tccacaattt caagcggaag ggggggattg gggggtacag tgcgggggag 4080
cggatcgtgg acatcatcgc gaccgacatc cagactaagg agctgcaaaa gcagattacc 4140
aagattcaga atttccgggt ctactacagg gacagcagaa atcccctctg gaaaggccca 4200
gcgaagctcc tctggaaggg tgagggggca gtagtgatcc aggataatag cgacatcaag 4260
gtggtgccca gaagaaaggc gaagatcatt agggattatg gcaaacagat ggcgggtgat 4320
gattgcgtgg cgagcagaca ggatgaggat tag 4353



<210> 14 
<211> 4327 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: pSYNGP4 -codon 
optimised HIV-1 gagpol with 20bp of the leader 
sequence of HIV-1 

<400> 14 

cggaggctag aaggagagag atgggcgccc gcgccagcgt gctgtcgggc ggcgagctgg 60 
accgctggga gaagatccgc ctgcgccccg gcggcaaaaa gaagtacaag ctgaagcaca 120
tcgtgtgggc cagccgcgaa ctggagcgct tcgccgtgaa ccccgggctc ctggagacca 180
gcgaggggtg ccgccagatc ctcggccaac tgcagcccag cctgcaaacc ggcagcgagg 240
agctgcgcag cctgtacaac accgtggcca cgctgtactg cgtccaccag cgcatcgaaa 300
tcaaggatac gaaagaggcc ctggataaaa tcgaagagga acagaataag agcaaaaaga 360
aggcccaaca ggccgccgcg gacaccggac acagcaacca ggtcagccag aactacccca 420
tcgtgcagaa catccagggg cagatggtgc accaggccat ctccccccgc acgctgaacg 480
cctgggtgaa ggtggtggaa gagaaggctt ttagcccgga ggtgataccc atgttctcag 540
ccctgtcaga gggagccacc ccccaagatc tgaacaccat gctcaacaca gtggggggac 600
accaggccgc catgcagatg ctgaaggaga ccatcaatga ggaggctgcc gaatgggatc 660
gtgtgcatcc ggtgcacgca gggcccatcg caccgggcca gatgcgtgag ccacggggct 720
cagacatcgc cggaacgact agtacccttc aggaacagat cggctggatg accaacaacc 780
cacccatccc ggtgggagaa atctacaaac gctggatcat cctgggcctg aacaagatcg 840
tgcgcatgta tagccctacc agcatcctgg acatccgcca aggcccgaag gaaccctttc 900
gcgactacgt ggaccggttc tacaaaacgc tccgcgccga gcaggctagc caggaggtga 960
agaactggat gaccgaaacc ctgctggtcc agaacgcgaa cccggactgc aagacgatcc 1020
tgaaggccct gggcccagcg gctaccctag aggaaatgat gaccgcctgt cagggagtgg 1080
gcggacccgg ccacaaggca cgcgtcctgg ctgaggccat gagccaggtg accaactccg 1140
ctaccatcat gatgcagcgc ggcaactttc ggaaccaacg caagatcgtc aagtgcttca 1200
actgtggcaa agaagggcac acagcccgca actgcagggc ccctaggaaa aagggctgtt 1260
ggaaatgtgg aaaggaagga caccaaatga aagattgtac tgagagacag gctaattttt 1320
tagggaagat ctggccttcc cacaagggaa ggccagggaa ttttcttcag agcagaccag 1380
agccaacagc cccaccagaa gagagcttca ggtttgggga agagacaaca actccctctc 1440
agaagcagga gccgatagac aaggaactgt atcctttagc ttccctcaga tcactctttg 1500
gcagcgaccc ctcgtcacaa taaagatagg ggggcagctc aaggaggctc tcctggacac 1560
cggagcagac gacaccgtgc tggaggagat gtcgttgcca ggccgctgga agccgaagat 1620
gatcggggga atcggcggtt tcatcaaggt gcgccagtat gaccagatcc tcatcgaaat 1680
ctgcggccac aaggctatcg gtaccgtgct ggtgggcccc acacccgtca acatcatcgg 1740
acgcaacctg ttgacgcaga tcggttgcac gctgaacttc cccattagcc ctatcgagac 1800
ggtaccggtg aagctgaagc ccgggatgga cggcccgaag gtcaagcaat ggccattgac 1860
agaggagaag atcaaggcac tggtggagat ttgcacagag atggaaaagg aagggaaaat 1920
ctccaagatt gggcctgaga acccgtacaa cacgccggtg ttcgcaatca agaagaagga 1980
ctcgacgaaa tggcgcaagc tggtggactt ccgcgagctg aacaagcgca cgcaagactt 2040
ctgggaggtt cagctgggca tcccgcaccc cgcagggctg aagaagaaga aatccgtgac 2100
cgtactggat gtgggtgatg cctacttctc cgttcccctg gacgaagact tcaggaagta 2160
cactgccttc acaatccctt cgatcaacaa cgagacaccg gggattcgat atcagtacaa 2220
cgtgctgccc cagggctgga aaggctctcc cgcaatcttc cagagtagca tgaccaaaat 2280
cctggagcct ttccgcaaac agaaccccga catcgtcatc tatcagtaca tggatgactt 2340
gtacgtgggc tctgatctag agatagggca gcaccgcacc aagatcgagg agctgcgcca 2400
gcacctgttg aggtggggac tgaccacacc cgacaagaag caccagaagg agcctccctt 2460
cctctggatg ggttacgagc tgcaccctga caaatggacc gtgcagccta tcgtgctgcc 2520
agagaaagac agctggactg tcaacgacat acagaagctg gtggggaagt tgaactgggc 2580
cagtcagatt tacccaggga ttaaggtgag gcagctgtgc aaactcctcc gcggaaccaa 2640
ggcactcaca gaggtgatcc ccctaaccga ggaggccgag ctcgaactgg cagaaaaccg 2700
agagatccta aaggagcccg tgcacggcgt gtactatgac ccctccaagg acctgatcgc 2760
cgagatccag aagcaggggc aaggccagtg gacctatcag atttaccagg agcccttcaa 2820
gaacctgaag accggcaagt acgcccggat gaggggtgcc cacactaacg acgtcaagca 2880
gctgaccgag gccgtgcaga agatcaccac cgaaagcatc gtgatctggg gaaagactcc 2940
taagttcaag ctgcccatcc agaaggaaac ctgggaaacc tggtggacag agtattggca 3000
ggccacctgg attcctgagt gggagttcgt caacacccct cccctggtga agctgtggta 3060
ccagctggag aaggagccca tagtgggcgc cgaaaccttc tacgtggatg gggccgctaa 3120
cagggagact aagctgggca aagccggata cgtcactaac cggggcagac agaaggttgt 3180
caccctcact gacaccacca accagaagac tgagctgcag gccatttacc tcgctttgca 3240
ggactcgggc ctggaggtga acatcgtgac agactctcag tatgccctgg gcatcattca 3300
agcccagcca gaccagagtg agtccgagct ggtcaatcag atcatcgagc agctgatcaa 3360
gaaggaaaag gtctatctgg cctgggtacc cgcccacaaa ggcattggcg gcaatgagca 3420
ggtcgacaag ctggtctcgg ctggcatcag gaaggtgcta ttcctggatg gcatcgacaa 3480
ggcccaggac gagcacgaga aataccacag caactggcgg gccatggcta gcgacttcaa 3540
cctgccccct gtggtggcca aagagatcgt ggccagctgt gacaagtgtc agctcaaggg 3600
cgaagccatg catggccagg tggactgtag ccccggcatc tggcaactcg attgcaccca 3660
tctggagggc aaggttatcc tggtagccgt ccatgtggcc agtggctaca tcgaggccga 3720
ggtcattccc gccgaaacag ggcaggagac agcctacttc ctcctgaagc tggcaggccg 3780
gtggccagtg aagaccatcc atactgacaa tggcagcaat ttcaccagtg ctacggttaa 3840
ggccgcctgc tggtgggcgg gaatcaagca ggagttcggg atcccctaca atccccagag 3900
tcagggcgtc gtcgagtcta tgaataagga gttaaagaag attatcggcc aggtcagaga 3960
tcaggctgag catctcaaga ccgcggtcca aatggcggta ttcatccaca atttcaagcg 4020
gaaggggggg attggggggt acagtgcggg ggagcggatc gtggacatca tcgcgaccga 4080
catccagact aaggagctgc aaaagcagat taccaagatt cagaatttcc gggtctacta 4140
cagggacagc agaaatcccc tctggaaagg cccagcgaag ctcctctgga agggtgaggg 4200
ggcagtagtg atccaggata atagcgacat caaggtggtg cccagaagaa aggcgaagat 4260
cattagggat tatggcaaac agatggcggg tgatgattgc gtggcgagca gacaggatga 4320
ggattag 4327



<210> 15 
<211> 22 
<212> RNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Illustrative 
helix II sequence 

<400> 15 

cugaugaggc cgaaaggccg aa 



<210> 16 
<211> 22 
<212> RNA 

<213> Human immunodeficiency virus type 1 
<400> 16

uaguaagaau guauagcccu ac 



<210> 17 
<211> 22 
<212> RNA 

<213> Human immunodeficiency virus type 1 
<400> 17 

aacccagauu guaagacuau uu 



<210> 18 
<211> 22 
<212> RNA 

<213> Human immunodeficiency virus type 1 
<400> 18 

uguuucaauu guggcaaaga ag 



<210> 19 
<211> 22 
<212> RNA 

<213> Human immunodeficiency virus type 1 
<400> 19 

aaaaagggcu guuggaaaug ug 



<210> 20 
<211> 22 
<212> RNA 

<213> Human immunodeficiency virus type 1 
<400> 20 

acgaccccuc gucacaauaa ag 



<210> 21 
<211> 22 
<212> RNA 

<213> Human immunodeficiency virus type 1 
<400> 21 

ggaauuggag guuuuaucaa ag 



<210> 22 

<211> 22 

<212> RNA 

<213> Human immunodeficiency virus type 1 



<400> 22 

auauuuuuca guucccuuag au 
<210> 23
<211> 22
<212> RNA

<213> Human immunodeficiency virus type 1
<400> 23 

uggaugauuu guauguagga uc 



<210> 24 
<211> 22 
<212> RNA 

<213> Human immunodeficiency virus type 1 



<400> 24 

cuuuggaugg guuaugaacu cc 



<210> 25 
<211> 22 
<212> RNA 

<213> Human immunodeficiency virus type 1 
<400> 25 

cagcuggacu gucaaugaca ua 



<210> 26 
<211> 22 
<212> RNA 

<213> Human immunodeficiency virus type 1 
<400> 26 

aacuuucuau guagaugggg ca 



<210> 27 
<211> 22 
<212> RNA 

<213> Human immunodeficiency virus type 1 
<400> 27 

aaggccgccu guuggugggc ag 



<210> 28 
<211> 22 
<212> RNA 

<213> Human immunodeficiency virus type 1 
<400> 28 

uaagacagca guacaaaugg ca 



<210> 29 

<211> 30 

<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: Primer 
<400> 29 

cagctgctcg agcagctgaa gcttgcatgc 



<210> 30 
<211> 34 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 
<400> 30 

gtaagttatg taacggacga tatcttgtct tctt 34 



<210> 31 
<211> 37 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 
<400> 31 

cgcatagtcg acgggcccgc cactgctaga gattttc 



<210> 32 
<211> 116 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 32 

tcgaggtcga ctggtggaca gggaaggatt cgaaccttcg aagtcgatga cgtagagaaa 60
aaatggtggc agtagaagga ttcgaacctt cgaagtcgat gacgtcatcc ccgggc 116 



<210> 33 
<211> 110 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 33 

tcgaggtcga ctggtggaac tggaaggatt cgaaccttcg aagtcgatga cgttcctaaa 60 
aaatggtgaa tcatgaagga ttcgaacctt cgaagtcgat gacgtaatac 110 
<210> 34
<211> 110 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 34 

tcgaggtcga ctggtgggcc ccgaaggatt cgaaccttcg aagtcgatga cgtggaaaaa 60 
aaatggtggg aagagaagga ttcgaacctt cgaagtcgat gacgttggcc 110



<210> 35 
<211> 110 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 35 

tcgaggtcga ctggtgacag cagaaggatt cgaaccttcg aagtcgatga cgttcagaaa 60 
aaatggtgaa gcaagaagga ttcgaacctt cgaagtcgat gacgtagccc 110



<210> 36 
<211> 110 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 36 

tcgaggtcga ctggtgtaag aagaaggatt cgaaccttcg aagtcgatga cgttataaaa 60 
aaatggtgac cggtgaagga ttcgaacctt cgaagtcgat gacgttatac 110



<210> 37 
<211> 116 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 37 

tcgaggcatg cgtcgactgg tgggcctaga aggattcgaa ccttcgaagt cgatgacgtg 60 
cacaaaaaat ggtgaactac gaaggattcg aaccttcgaa gtcgatgacg tgtacc 116



<210> 38
<211> 12 
<212> DNA 

<213> Human immunodeficiency virus type 1 
<400> 38 

atgggtgcga ga 12 



<210> 39 
<211> 12 
<212> DNA 

<213> Human immunodeficiency virus type 1 
<400> 39 

gatgaggatt ag 12 



<210> 40 
<211> 12 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 

gagpol-SYNgp-codon optimised gagpol sequence 

<400> 40 

atgggcgccc gc 12 



<210> 41 
<211> 12 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 

gagpol-SYNgp-codon optimised gagpol sequence 

<400> 41 

gatgaggatt ag 12 



<210> 42 
<211> 12 
<212> DNA 

<213> Human immunodeficiency virus type 1 
<400> 42 

atgagagtga ag 12 



<210> 43 
<211> 12 
<212> DNA 

<213> Human immunodeficiency virus type 1 
<400> 43

gctttgctat aa 12 

<210> 44 
<211> 12 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 

SYNgp-160mn-codon optimised env sequence

<400> 44 

atgagggtga ag 12 

<210> 45 

<211> 12 

<212> DNA 

<213> Artificial Sequence 

<220>

<223> Description of Artificial Sequence:
SYNgp-160mn-codon optimised env sequence

<400> 45

gcgctgctgt aa 12

<210> 46

<211> 34

<212> RNA

<213> Human immunodeficiency virus type 1

<400> 46

ggcucgaacu ugucgugguu aucguggaug uguc 34 

<210> 47 
<211> 63 
<212> RNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: EGS based on 
Tyrosyl t-RNA

<400> 47 

cgauagcaga cucuaaaucu gccgucaucg acuucgaagg uucgaauccu ucccaggaca 60 
cca 63 



<210> 48 
<211> 66 
<212> RNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: Consensus EGS 
sequence 

<220> 

<221> modified_base

<222> (1)..(7)

<223> Any nucleotide

<220>

<221> modified_base

<222> (56)..(61)

<223> Any nucleotide

<400> 48 

nnnnnnnagc agacucuaaa ucugccguca ucgacuucga agguucgaau ccuucnnnnn 60 
ncacca 



<210> 49 
<211> 49 
<212> RNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Consensus EGS 
sequence 

<220> 

<221> modified_base

<222> (1)..(7)

<223> Any nucleotide

<220>

<221> modified_base
<222> (39)..(44)
<223> Any nucleotide 

<400> 49 

nnnnnnnacg ucaucgacuu cgaagguucg aauccuucnn nnnncacca 



<210> 50 
<211> 13 
<212> RNA 

<213> Human immunodeficiency virus type 1 

<400> 50 
gggccuauag cac 



<210> 51 
<211> 13 
<212> RNA 

<213> Human immunodeficiency virus type 1 



<400> 51 
gaacuacuag uac 13
<210> 52
<211> 13
<212> RNA

<213> Human immunodeficiency virus type 1

<400> 52 
guaagaaugu aua 

<210> 53 
<211> 13 
<212> RNA 

<213> Human immunodeficiency virus type 1 

<400> 53 
gaccgguucu aua 

<210> 54 
<211> 13 
<212> RNA 

<213> Human immunodeficiency virus type 1 

<400> 54 
gacagcaugu cag 



<210> 55 
<211> 13 
<212> RNA 

<213> Human immunodeficiency virus type 1 

<400> 55 
gaagcaauga gcc 

<210> 56 
<211> 13 
<212> RNA 

<213> Human immunodeficiency virus type 1 

<400> 56 
gggccccuag gaa 



<210> 57 
<211> 13 
<212> RNA 

<213> Human immunodeficiency virus type 1 

<400> 57 
gggaagaucu ggc 



<210> 58 
<211> 13 
<212> RNA

<213> Human immunodeficiency virus type 1 
<400> 58 

ggaacuguau ccu 13



<210> 59 
<211> 13 
<212> RNA 

<213> Human immunodeficiency virus type 1 
<400> 59 

gaaucuauga aua 13 



<210> 60 
<211> 13 
<212> RNA 

<213> Human immunodeficiency virus type 1 
<400> 60 

ggacagguaa gag 13 



<210> 61 
<211> 13 
<212> RNA 

<213> Human immunodeficiency virus type 1 
<400> 61 

ggcaguauuc auc 13 



<210> 62 
<211> 46 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Combined DNA/RNA Molecule: Anti-HIV 
EGS construct 

<220> 

<223> Description of Artificial Sequence: Anti-HIV EGS 
construct 

<400> 62 

gtgcacguca ucgacuucga agguucgaau ccuucuaggc ccacca 46



<210> 63 

<211> 46 

<212> DNA 

<213> Artificial Sequence
<220>

<223> Description of Combined DNA/RNA Molecule: Anti-HIV 
EGS construct 

<220> 

<223> Description of Artificial Sequence: Anti-HIV EGS 
construct 

<400> 63 

gtacacguca ucgacuucga agguucgaau ccuucguagu ucacca 46



<210> 64 
<211> 46 
<212> RNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Anti-HIV EGS 
construct 

<400> 64 

uauaacguca ucgacuucga agguucgaau ccuucuucuu acacca 46 



<210> 65 
<211> 46 
<212> RNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Anti-HIV EGS 
construct 

<400> 65 

uauaacguca ucgacuucga agguucgaau ccuucaccgg ucacca 46



<210> 66 
<211> 46 
<212> RNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Anti-HIV EGS 
construct 

<400> 66 

cugaacguca ucgacuucga agguucgaau ccuucugcug ucacca 46 



<210> 67 

<211> 46 

<212> RNA 

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Anti-HIV EGS 
construct 

<400> 67 

ggcuacguca ucgacuucga agguucgaau ccuucuugcu ucacca 



<210> 68 
<211> 46 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Combined DNA/RNA Molecule: Anti-HIV 
EGS construct 

<220> 

<223> Description of Artificial Sequence: Anti-HIV EGS 
construct 

<400> 68 

ttccacguca ucgacuucga agguucgaau ccuucggggc ccacca 



<210> 69 
<211> 46 
<212> RNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Anti-HIV EGS 
construct 

<400> 69 

gccaacguca ucgacuucga agguucgaau ccuucucuuc ccacca 



<210> 70 

<211> 46 

<212> RNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Anti-HIV EGS 
construct 

<400> 70 

aggaacguca ucgacuucga agguucgaau ccuuccaguu ccacca 



<210> 71 

<211> 46 

<212> RNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: Anti-HIV EGS 
construct 

<400> 71 

uauuacguca ucgacuucga agguucgaau ccuucuagau ucacca 46 

<210> 72 
<211> 46 
<212> RNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Anti-HIV EGS 
construct 

<400> 72 

cucuacguca ucgacuucga agguucgaau ccuucccugu ccacca 46

<210> 73 
<211> 46 
<212> RNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Anti-HIV EGS 
construct 



<400> 73 

gaugacguca ucgacuucga agguucgaau ccuucuacug ccacca 